SOLUBLE, CLEAVABLE SUBSTRATES OF THE
HEPATITIS C VIRUS PROTEASE

BACKGROUND OF THE INVENTION


Hepatitis C virus (HCV) is considered to be the major etiological
agent of non-A non-B (NANB) hepatitis, chronic liver disease, and
hepatocellular carcinoma (HCC) around the world. The viral infection
accounts for greater than 90% of transfusion-associated hepatitis in the
U.S. and it is the predominant form of hepatitis in adults over 40 years
of age. Almost all of the infections result in chronic hepatitis and
nearly 20% develop liver cirrhosis.

The virus particle has not been identified due to the lack of an
efficient in vitro replication system and the extremely low amount of
HCV particles in infected liver tissues or blood. However, molecular
cloning of the viral genome has been accomplished by isolating the
messenger RNA (mRNA) from the serum of infected chimpanzees and then
cloning it using recombinant methodologies. [Grakoui A. et al., J. Virol.
67: 1385-1395 (1993)] It is now known that HCV contains a positive strand
RNA genome comprising approximately 9400 nucleotides, whose
organization is similar to that of flaviviruses and pestiviruses. The
genome of HCV, like that of flavi- and pestiviruses, encodes a single
large polyprotein of about 3000 amino acids which undergoes proteolysis
to form mature viral proteins in infected cells.

Cell-free translation of the viral polyprotein and cell culture
expression studies have established that the HCV polyprotein is
processed by cellular and viral proteases to produce the putative
structural and nonstructural (NS) proteins. At least nine mature viral
proteins are produced from the polyprotein by specific proteolysis. The
order and nomenclature of the cleavage products are as follows:
NH2-C-E1-E2-NS2-NS3-NS4A-NS4B-NS5A-NS5B-COOH (Fig. 1). The three
amino terminal putative structural proteins, C (capsid), E1, and E2 (two
envelope glycoproteins), are believed to be cleaved by host signal
peptidases of the endoplasmic reticulum (ER). The host enzyme is also
responsible for generating the amino terminus of NS2. The proteolytic
processing of the nonstructural proteins is carried out by the viral
proteases NS2-3 and NS3, contained within the viral polyprotein. The
NS2-3 protease catalyzes the cleavage between NS2 and NS3. It is a
metalloprotease and requires both NS2 and the protease domain of NS3.
The NS3 protease catalyzes the rest of the cleavages of the substrates in
the nonstructural part of the polyprotein. The NS3 protein contains 631
amino acid residues and is comprised of two enzymatic domains: the
protease domain contained within amino acid residues 1-181 and a
helicase ATPase domain contained within the rest of the protein. It is
not known if the 70 kD NS3 protein is cleaved further in infected cells to
separate the protease domain from the helicase domain; however, no
cleavage has been observed in cell culture expression studies.

The NS3 protease is a member of the serine class of enzymes. It
contains His, Asp, and Ser as the catalytic triad, Ser being the active site
residue. Mutation of the Ser residue abolishes the cleavages at substrates
NS3/4A, NS4A/4B, NS4B/5A, and NS5A/5B. The cleavage between
NS3 and NS4A is intramolecular, whereas the cleavages at NS4A/4B,
4B/5A, 5A/5B sites occur in trans.
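
The processing scheme described in the preceding paragraphs can be restated as a small lookup table. The Python sketch below is only an illustrative summary of that text. The names PRODUCTS, JUNCTION_ENZYME, junctions and ns3_trans_sites are hypothetical and are not part of the disclosure.

```python
# Order of the mature products along the polyprotein, as listed in the text.
PRODUCTS = ["C", "E1", "E2", "NS2", "NS3", "NS4A", "NS4B", "NS5A", "NS5B"]

# Enzyme and mode of cleavage for each junction, following the description:
# host signal peptidase for the structural region and the NS2 amino terminus,
# NS2-3 for NS2/NS3, and NS3 (in cis or in trans) for the remaining sites.
JUNCTION_ENZYME = {
    "C/E1": ("host signal peptidase", "host"),
    "E1/E2": ("host signal peptidase", "host"),
    "E2/NS2": ("host signal peptidase", "host"),
    "NS2/NS3": ("NS2-3 metalloprotease", "viral"),
    "NS3/NS4A": ("NS3 serine protease", "cis"),
    "NS4A/NS4B": ("NS3 serine protease", "trans"),
    "NS4B/NS5A": ("NS3 serine protease", "trans"),
    "NS5A/NS5B": ("NS3 serine protease", "trans"),
}


def junctions(products):
    """Return the junction names between consecutive products."""
    return [f"{left}/{right}" for left, right in zip(products, products[1:])]


def ns3_trans_sites():
    """Return the junctions that the text says NS3 cleaves in trans."""
    return [j for j in junctions(PRODUCTS) if JUNCTION_ENZYME[j][1] == "trans"]


if __name__ == "__main__":
    for name in junctions(PRODUCTS):
        enzyme, mode = JUNCTION_ENZYME[name]
        print(f"{name}: {enzyme} ({mode})")
```

The three sites returned by ns3_trans_sites() are the ones for which separate, soluble substrate peptides are of interest in a screen, since they are cleaved intermolecularly.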



Experiments using transient expression of various forms of HCV
NS polyproteins in mammalian cells have established that the NS3
serine protease is necessary but not sufficient for efficient processing of
all these cleavages. Like flaviviruses, the HCV NS3 protease also
requires a cofactor to catalyze some of these cleavage reactions. In
addition to the serine protease NS3, the NS4A protein is absolutely
required for the cleavage of the substrate at the 4B/5A site and increases
the efficiency of cleavage of the substrate between 5A/5B, and possibly
4A/4B.



Because the HCV NS3 protease cleaves the non-structural HCV
proteins which are necessary for HCV replication, the NS3 protease
can be a target for the development of therapeutic agents against HCV.
The gene encoding the HCV NS3 protein has been cloned as disclosed in
U.S. Patent No. 5,371,017; however, it has not been expressed in a
soluble, active form useful for discovering inhibitors of the NS3
protease. Also, the substrates 4A/4B, 4B/5A and 5A/5B have been cloned
but not expressed in a soluble, active form useful for discovering
inhibitors of the NS3 protease. If the HCV protease is to be useful as a
target in a screen to discover therapeutic agents, both the protease and
substrates must be in soluble active form. Thus, there is a need for a
soluble active form of the HCV protease substrates which can be
produced in large quantities to be used in a high throughput screen to
discover inhibitors of the protease and for structural studies.

SUMMARY OF THE INVENTION

The present invention fills this need by providing soluble
HCV substrates which comprise the nonstructural polyprotein cleavage
sites of HCV. The substrate peptides are made soluble by attaching a
solubilizing motif to the peptide. In particular the sequences of the
substrates defined by SEQ ID NOS: 16, 17, 18, 19, 20, and 21 are claimed.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 schematically depicts the HCV polyprotein. 

Figure 2 depicts the recombinant synthesis of plasmid pBJ1015.

Figure 3 depicts the recombinant synthesis of plasmid pTS56-9.

Figure 4 depicts the recombinant synthesis of plasmid pJB1006.

Figure 5 depicts the recombinant synthesis of plasmid pBJ1022.

Figure 6 depicts the recombinant synthesis of plasmid
pNB(-V)182A4AHT.

Figure 7 depicts the recombinant synthesis of plasmid pT5His/HIV/183.

Figure 8 schematically depicts a high throughput assay for discovering
HCV protease inhibitors using surface plasmon resonance technology.

DETAILED DESCRIPTION OF THE INVENTION

The teachings of all references cited are incorporated herein in
their entirety by reference.

The present invention is a soluble form of the HCV nonstructural
polyproteins which are substrates for the HCV NS3 protease. The HCV
NS3 protease cleaves the polyprotein and separates the 4A/4B, 4B/5A,
and 5A/5B regions of the HCV polyprotein. One can use the
noncleaved substrates to assay for protease inhibitors. Using the
scintillation proximity assay or the surface plasmon assay described
below, one can determine whether or not the HCV protease has cleaved
the substrate which is used. If the substrate is not cleaved, then the
substance which is being tested is an HCV protease inhibitor. If, on
the other hand, the substrate is cleaved, then the substance which is
being tested is not a protease inhibitor. The substrates of the present
invention are made soluble by attaching a solubilizing motif onto the
substrate. Examples of solubilizing motifs are ionizable amino acids
such as arginine and lysine.



The substrates 5A/5B and 4B/5A can be synthesized by a suitable
method such as by exclusive solid phase synthesis, partial solid phase
methods, fragment condensation or classical solution synthesis. The
polypeptides are preferably prepared by solid phase peptide synthesis as
described by Merrifield, J. Am. Chem. Soc. 85:2149 (1963). The synthesis
is carried out with amino acids that are protected at the alpha-amino
terminus. Trifunctional amino acids with labile side-chains are also
protected with suitable groups to prevent undesired chemical reactions
from occurring during the assembly of the polypeptides. The alpha-amino
protecting group is selectively removed to allow subsequent
reaction to take place at the amino-terminus. The conditions for the
removal of the alpha-amino protecting group do not remove the
side-chain protecting groups.

The alpha-amino protecting groups are those known to
be useful in the art of stepwise polypeptide synthesis. Included are
acyl type protecting groups (e.g., formyl, trifluoroacetyl, acetyl), aryl
type protecting groups (e.g., biotinyl), aromatic urethane type
protecting groups [e.g., benzyloxycarbonyl (Cbz), substituted
benzyloxycarbonyl and 9-fluorenylmethyloxycarbonyl (Fmoc)],
aliphatic urethane protecting groups [e.g., t-butyloxycarbonyl (tBoc),
isopropyloxycarbonyl, cyclohexyloxycarbonyl] and alkyl type
protecting groups (e.g., benzyl, triphenylmethyl). The preferred
protecting groups are tBoc and Fmoc; thus the peptides are said to be
synthesized by tBoc and Fmoc chemistry, respectively.



The side-chain protecting groups selected must remain
intact during coupling and not be removed during the deprotection
of the amino-terminus protecting group or during coupling
conditions. The side-chain protecting groups must also be
removable upon the completion of synthesis, using reaction
conditions that will not alter the finished polypeptide. In tBoc
chemistry, the side-chain protecting groups for trifunctional amino
acids are mostly benzyl based. In Fmoc chemistry, they are mostly
tert-butyl or trityl based.



In tBoc chemistry, the preferred side-chain protecting
groups are tosyl for Arg, cyclohexyl for Asp, 4-methylbenzyl (and
acetamidomethyl) for Cys, benzyl for Glu, Ser and Thr,
benzyloxymethyl (and dinitrophenyl) for His, 2-Cl-benzyloxycarbonyl
for Lys, formyl for Trp and 2-bromobenzyl for Tyr. In Fmoc
chemistry, the preferred side-chain protecting groups are
2,2,5,7,8-pentamethylchroman-6-sulfonyl (Pmc) or
2,2,4,6,7-pentamethyldihydrobenzofuran-5-sulfonyl (Pbf) for Arg, trityl
for Asn, Cys, Gln and His, tert-butyl for Asp, Glu, Ser, Thr and Tyr,
tBoc for Lys and Trp.



For the synthesis of phosphopeptides, either direct or
post-assembly incorporation of the phosphate group is used. In the
direct incorporation strategy, the phosphate group on Ser, Thr or Tyr
may be protected by methyl, benzyl or tert-butyl in Fmoc chemistry or
by methyl, benzyl or phenyl in tBoc chemistry. Direct incorporation
of phosphotyrosine without phosphate protection can also be used in
Fmoc chemistry. In the post-assembly incorporation strategy, the
unprotected hydroxyl group of Ser, Thr or Tyr is derivatized on
solid phase with di-tert-butyl-, dibenzyl- or
dimethyl-N,N-diisopropylphosphoramidite and then oxidized by
tert-butylhydroperoxide.



Solid phase synthesis is usually carried out from the
carboxyl-terminus by coupling the alpha-amino protected (side-chain
protected) amino acid to a suitable solid support. An ester linkage is
formed when the attachment is made to a chloromethyl, chlorotrityl
or hydroxymethyl resin, and the resulting polypeptide will have a
free carboxyl group at the C-terminus. Alternatively, when an amide
resin such as benzhydrylamine or p-methylbenzhydrylamine resin
(for tBoc chemistry) and Rink amide or PAL resin (for Fmoc
chemistry) is used, an amide bond is formed and the resulting
polypeptide will have a carboxamide group at the C-terminus. These
resins, whether polystyrene- or polyamide-based or
polyethyleneglycol-grafted, with or without a handle or linker, with
or without the first amino acid attached, are commercially available,
and their preparations have been described by Stewart et al. (1984),
"Solid Phase Peptide Synthesis" (2nd Edition), Pierce Chemical Co.,
Rockford, IL; Bayer & Rapp (1986) Chem. Pept. Prot. 3, 3; and
Atherton et al. (1989) Solid Phase Peptide Synthesis: A Practical
Approach, IRL Press, Oxford.



The C-terminal amino acid, protected at the side-chain if
necessary and at the alpha-amino group, is attached to a
hydroxymethyl resin using various activating agents including
dicyclohexylcarbodiimide (DCC), N,N'-diisopropylcarbodiimide
(DIPCDI) and carbonyldiimidazole (CDI). It can be attached to
chloromethyl or chlorotrityl resin directly in its cesium
tetramethylammonium salt form or in the presence of triethylamine
(TEA) or diisopropylethylamine (DIEA). First amino acid
attachment to an amide resin is the same as amide bond formation
during coupling reactions.

Following the attachment to the resin support, the
alpha-amino protecting group is removed using various reagents
depending on the protecting chemistry (e.g., tBoc, Fmoc). The extent
of Fmoc removal can be monitored at 300-320 nm or by a
conductivity cell. After removal of the alpha-amino protecting
group, the remaining protected amino acids are coupled stepwise in
the required order to obtain the desired sequence.



Various activating agents can be used for the coupling
reactions including DCC, DIPCDI, 2-chloro-1,3-dimethylimidium
hexafluorophosphate (CIP), benzotriazol-1-yl-oxy-tris-
(dimethylamino)-phosphonium hexafluorophosphate (BOP) and its
pyrrolidine analog (PyBOP), bromo-tris-pyrrolidino-phosphonium
hexafluorophosphate (PyBroP), O-(benzotriazol-1-yl)-1,1,3,3-
tetramethyluronium hexafluorophosphate (HBTU) and its
tetrafluoroborate analog (TBTU) or its pyrrolidine analog (HBPyU),
O-(7-azabenzotriazol-1-yl)-1,1,3,3-tetramethyluronium
hexafluorophosphate (HATU) and its tetrafluoroborate analog
(TATU) or pyrrolidine analog (HAPyU). The most common catalytic
additives used in coupling reactions include
4-dimethylaminopyridine (DMAP), 3-hydroxy-3,4-dihydro-4-oxo-1,2,3-
benzotriazine (HODhbt), N-hydroxybenzotriazole (HOBt) and
1-hydroxy-7-azabenzotriazole (HOAt). Each protected amino acid is
used in excess (>2.0 equivalents), and the couplings are usually
carried out in N-methylpyrrolidone (NMP) or in DMF, CH2Cl2 or
mixtures thereof. The extent of completion of the coupling reaction
can be monitored at each stage, e.g., by the ninhydrin reaction as
described by Kaiser et al., Anal. Biochem. 34:595 (1970). In cases
where incomplete coupling is found, the coupling reaction is
extended and repeated and may have chaotropic salts added. The
coupling reactions can be performed automatically with
commercially available instruments such as ABI model 430A, 431A
and 433A peptide synthesizers.

After the entire assembly of the desired peptide, the
peptide-resin is cleaved with a reagent with proper scavengers. The Fmoc
peptides are usually cleaved and deprotected by TFA with scavengers
(e.g., H2O, ethanedithiol, phenol and thioanisole). The tBoc peptides
are usually cleaved and deprotected with liquid HF for 1-2 hours at -5
to 0°C, which cleaves the polypeptide from the resin and removes
most of the side-chain protecting groups. Scavengers such as anisole,
dimethylsulfide and p-thiocresol are usually used with the liquid HF
to prevent cations formed during the cleavage from alkylating and
acylating the amino acid residues present in the polypeptide. The
formyl group of Trp and dinitrophenyl group of His need to be
removed, respectively, by piperidine and thiophenol in DMF prior
to the HF cleavage. The acetamidomethyl group of Cys can be
removed by mercury (II) acetate and alternatively by iodine, thallium
(III) trifluoroacetate or silver tetrafluoroborate, which simultaneously
oxidize cysteine to cystine. Other strong acids used for tBoc peptide
cleavage and deprotection include trifluoromethanesulfonic acid
(TFMSA) and trimethylsilyl trifluoromethanesulfonate (TMSOTf).

Recombinant DNA methodology can also be used to prepare
the polypeptide substrates. The known genetic code, tailored if
desired with known preferred codons for more efficient expression
in a given host organism, can be used to synthesize oligonucleotides
encoding the desired amino acid sequences. The phosphoramidite
solid support method of Matteucci et al., J. Am. Chem. Soc. 103:3185
(1981) or other known methods can be used for such syntheses. The
resulting oligonucleotides can be inserted into an appropriate vector
and expressed in a compatible host organism.

The peptides of the invention can be purified using HPLC, gel
filtration, ion exchange and partition chromatography, countercurrent
distribution or other well known methods.

Also disclosed is the production of the HCV NS3 protease in a
soluble form. The HCV NS3 protease must be in a soluble form to be
used in a screen to detect compounds which inhibit the protease from
cleaving its target substrate. We have discovered that if a peptide
containing a solubilizing motif is attached to the NS3 protease,
preferably to the carboxyl terminus, the NS3 protease becomes readily
soluble.

The amino acid sequence of the NS3 protease catalytic domain is
shown in SEQ ID NO: 1. Prior to the present invention the NS3
protease was not expressed in a cell in a soluble form in sufficient
quantities for extraction and purification. Moreover, HCV NS3
protease could not be produced in soluble form in bacteria. This is
important because bacterial expression is the preferred method of
expression of large quantities of HCV protease. Soluble HCV NS3
protease of the present invention can be produced in several ways. A
solubilizing motif can be fused to the protein resulting in a soluble
protein. A solubilizing motif is any chemical moiety bound to the HCV
NS3 protease which results in the NS3 protease becoming soluble in a
buffered solution. Examples of such solubilizing motifs are chains of
amino acids having polar side chains, preferably positively charged
amino acids. The chain of amino acids should be about 4-10 amino
acid residues in length. The preferred amino acids are arginine and
lysine. Another example of a solubilizing motif is an amphipathic
moiety. The solubilizing motif can be fused to either the amino
terminus or carboxy terminus of the NS3 protease. A sequence which
has been successfully fused to the carboxyl terminus to produce soluble
NS3 protease is -Arg-Lys-Lys-Lys-Arg-Arg- (SEQ ID NO: 2). This
has been fused to the carboxyl end of the NS3 protease to produce the
polypeptides of SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 8 and SEQ ID
NO: 27. Other examples of soluble HCV NS3 protease having a
hydrophilic amino acid residue tail which were made are SEQ ID NO: 9
and SEQ ID NO: 10.
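
The preferred form of the solubilizing motif (a chain of about 4-10 residues drawn from arginine and lysine, fused at the carboxyl terminus) can be expressed as a simple check. The following Python sketch is illustrative only. The function names are hypothetical, the test treats "about 4-10" as a strict range, and the short protein string used in the example is a made-up placeholder, not any of the sequences of the Sequence Listing.

```python
# One-letter codes for the preferred residues named in the text.
PREFERRED_RESIDUES = set("RK")  # arginine and lysine

# The motif of SEQ ID NO: 2, -Arg-Lys-Lys-Lys-Arg-Arg-, in one-letter code.
SEQ_ID_NO_2_MOTIF = "RKKKRR"


def is_preferred_motif(motif):
    """True if the motif is 4-10 residues long and made only of Arg/Lys."""
    return 4 <= len(motif) <= 10 and set(motif) <= PREFERRED_RESIDUES


def fuse_c_terminal(protein, motif):
    """Append the motif to the carboxyl terminus of a one-letter sequence."""
    if not is_preferred_motif(motif):
        raise ValueError("motif is not a 4-10 residue Arg/Lys chain")
    return protein + motif


if __name__ == "__main__":
    placeholder = "MAPIT"  # hypothetical stand-in for a protease sequence
    print(fuse_c_terminal(placeholder, SEQ_ID_NO_2_MOTIF))
```

The check deliberately covers only the preferred embodiment. Other polar chains or amphipathic moieties mentioned above would need a broader rule.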

Soluble HCV NS3 protease can also be produced which does not
have a solubilizing motif, as for example the proteases shown in SEQ ID
NO: 1 and SEQ ID NO: 7. Preferably the NS3 protease will have a
histidine tag fused to its amino terminus for use in purifying the
protein on a nickel (Ni2+) coated resin. See SEQ ID NO: 5. In this
embodiment the protease is produced as insoluble aggregates or as
inclusion bodies in bacteria such as E. coli.

The insoluble HCV NS3 protease is first extracted from the
bacteria by homogenization or sonication of the bacteria. The aggregates
containing the protease are then solubilized in a 5 M solution of
guanidine hydrochloride (GuHCl). The NS3 protease is then purified
from high molecular weight aggregates by size exclusion
chromatography, as for example by applying the solution to a
SEPHACRYL S-300 size exclusion gel column. Fractions containing the
NS3 protease in 5 M GuHCl are pooled and diluted to about 0.1 M GuHCl
in a refolding buffer comprised of dithiothreitol and lauryl maltoside.
The diluted solution is then applied to a reverse phase chromatography
column and pools containing the NS3 protease collected. The pH of the
protease fractions is then raised in a stepwise manner to about 7.4 so as
to produce properly refolded soluble, active NS3 protease.

It has also been discovered that the HCV NS3 protease is much
more effective in cleaving the HCV non-structural proteins if the
cofactor NS4A protein is present (SEQ ID NO: 6). Accordingly, the present
invention is also comprised of a fusion of the NS4A cofactor domain
protein with the NS3 protease, in particular the fusion of the NS3
protease and the NS4A cofactor wherein the NS4A is mutated such that
the fusion of the NS3 protease and the NS4A cofactor is not cleaved by
the NS3 protease. Examples of the fused NS3 and NS4A constructs are
shown in SEQ ID NOs: 7, 8, 9, 10 and 27.

DNA encoding the NS3 protease of this invention can be
prepared by chemical synthesis using the known nucleic acid
sequence [Ratner et al., Nucleic Acids Res. 13:5007 (1985)] and
standard methods such as the phosphoramidite solid support
method of Matteucci et al. [J. Am. Chem. Soc. 103:3185 (1981)] or the
method of Yoo et al. [J. Biol. Chem. 264:17078 (1989)]. See also Glick,
Bernard R. and Pasternak, Molecular Biotechnology: pages 55-63
(ASM Press, Washington, D.C. 1994). The gene encoding the protease
can also be obtained using the plasmid disclosed in Grakoui, A.,
Wychowski, C., Lin, C., Feinstone, S. M., and Rice, C. M., Expression
and Identification of Hepatitis C Virus Polyprotein Cleavage
Products, J. Virol. 67:1385-1395 (1993). Also, the nucleic acid encoding
HCV protease can be isolated, amplified and cloned (from patients
infected with the HCV virus). Furthermore, the HCV genome has
been disclosed in PCT WO 89/04669 and is available from the
American Type Culture Collection (ATCC), 12301 Parklawn Drive,
Rockville, MD under ATCC accession no. 40394.

Of course, because of the degeneracy of the genetic code, there
are many functionally equivalent nucleic acid sequences that can
encode mature human HCV protease as defined herein. Such
functionally equivalent sequences, which can readily be prepared
using known methods such as chemical synthesis, PCR employing
modified primers and site-directed mutagenesis, are within the scope
of this invention.
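
To give a sense of the scale of this degeneracy, the number of distinct coding sequences for a peptide is the product of the number of synonymous codons for each residue in the standard genetic code. The following Python sketch works this out for the six-residue motif of SEQ ID NO: 2. The names CODONS_PER_RESIDUE and count_encodings are hypothetical, and the calculation ignores host codon preference and the choice of stop codon.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (one-letter codes; the 20 values sum to the 61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


if __name__ == "__main__":
    # Arg-Lys-Lys-Lys-Arg-Arg: 6 * 2 * 2 * 2 * 6 * 6 possible sequences.
    print(count_encodings("RKKKRR"))  # prints 1728
```

Even this short motif has 1728 equivalent coding sequences, so the number for the full protease is far larger than could be listed individually.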

Various expression vectors can be used to express DNA
encoding HCV NS3 protease. Conventional vectors used for expression
of recombinant proteins in prokaryotic or eukaryotic cells may be used.
Preferred vectors include the pcD vectors described by Okayama et al.,
Mol. Cell. Biol. Vol. 3: 280-289 (1983); and Takebe et al., Mol. Cell.
Biol. Vol. 8: 466-472 (1988). Other SV40-based mammalian expression
vectors include those disclosed in Kaufman et al., Mol. Cell. Biol.
Vol. 2: 1304-1319 (1982) and U.S. Patent No. 4,675,285. These
SV40-based vectors are particularly useful in COS7 monkey cells (ATCC
No. CRL 1651), as well as in other mammalian cells such as mouse L
cells and CHO cells.

Standard transfection methods can be used to produce eukaryotic
cell lines which express large quantities of the polypeptide. Eukaryotic
cell lines include mammalian, yeast and insect cell lines. Exemplary
mammalian cell lines include COS-7 cells, mouse L cells and Chinese
Hamster Ovary (CHO) cells. See Sambrook et al., supra and Ausubel et
al., supra.

As used herein, the term "transformed bacteria" means bacteria
that have been genetically engineered to produce a mammalian protein.
Such genetic engineering usually entails the introduction of an
expression vector into a bacterium. The expression vector is capable of
autonomous replication and protein expression relative to genes in the
bacterial genome. Construction of bacterial expression vectors is well
known in the art, provided the nucleotide sequence encoding a desired
protein is known or otherwise available. For example, DeBoer in U.S.
Pat. No. 4,551,433 discloses promoters for use in bacterial expression
vectors; Goeddel et al. in U.S. Pat. No. 4,601,980 and Riggs in U.S. Pat.
No. 4,431,739 disclose the production of mammalian proteins by E. coli
expression systems; and Riggs, supra, Ferretti et al., Proc. Natl. Acad.
Sci. 83:599 (1986), Sproat et al., Nucleic Acids Research 13:2959 (1985)
and Mullenbach et al., J. Biol. Chem. 261:719 (1986) disclose how to
construct synthetic genes for expression in bacteria. Many bacterial
expression vectors are available commercially and through the American
Type Culture Collection (ATCC), Rockville, Maryland.



Insertion of DNA encoding human HCV protease into a
vector is easily accomplished when the termini of both the DNA and
the vector comprise the same restriction site. If this is not the case, it
may be necessary to modify the termini of the DNA and/or vector by
digesting back single-stranded DNA overhangs generated by
restriction endonuclease cleavage to produce blunt ends, or to
achieve the same result by filling in the single-stranded termini with
an appropriate DNA polymerase. Alternatively, any site desired may
be produced by ligating nucleotide sequences (linkers) onto the
termini. Such linkers may comprise specific oligonucleotide
sequences that define desired restriction sites. The cleaved vector
and the DNA fragments may also be modified if required by
homopolymeric tailing.

Many E. coli-compatible expression vectors can be used to
produce soluble HCV NS3 protease, including but not limited to
vectors containing bacterial or bacteriophage promoters such as the
Tac, Lac, Trp, LacUV5, λ Pr and λ Pl promoters. Preferably, a vector
selected will have expression control sequences that permit
regulation of the rate of HCV protease expression. Then, HCV
protease production can be regulated to avoid overproduction that
could prove toxic to the host cells. Most preferred is a vector
comprising, from 5' to 3' (upstream to downstream), a Tac promoter,
a lac Iq repressor gene and DNA encoding mature human HCV
protease. The vectors chosen for use in this invention may also
encode secretory leaders such as the ompA or protein A leader, as
long as such leaders are cleaved during post-translational processing
to produce mature HCV protease or, if the leaders are not cleaved, the
leaders do not interfere with the enzymatic activity of the protease.

Fusion peptides will typically be made by either recombinant
nucleic acid methods or by synthetic polypeptide methods. Techniques
for nucleic acid manipulation and expression are described generally,
e.g., in Sambrook, et al. (1989) Molecular Cloning: A Laboratory Manual
(2d ed.), vols. 1-3, Cold Spring Harbor Laboratory; and Ausubel, et al.
(eds.) (1993) Current Protocols in Molecular Biology, Greene and Wiley,
NY. Techniques for synthesis of polypeptides are described, e.g., in
Merrifield (1963) J. Amer. Chem. Soc. 85:2149-2156; Merrifield (1986)
Science 232: 341-347; Stewart et al. (1984), "Solid Phase Peptide
Synthesis" (2nd Edition), Pierce Chemical Co., Rockford, IL;
Atherton, et al. (1989) Solid Phase Peptide Synthesis: A Practical
Approach, IRL Press, Oxford; and Grant (1992) Synthetic Peptides: A
User's Guide, W.H. Freeman, NY.



One can use the NS3 protease, the NS4A cofactor and the peptide
substrates, either 4B/5A or 5A/5B, to develop high throughput assays.
These can be used to screen for compounds which inhibit proteolytic
activity of the protease. One does this by developing techniques for
determining whether or not a compound will inhibit the NS3 protease
from cleaving the viral substrates. Examples of such synthetic substrates
are SEQ ID NOs: 16, 17, 18, 19, 20 and 21. If the substrates are not
cleaved, the virus cannot replicate. One example of such a high
throughput assay is the scintillation proximity assay (SPA). SPA
technology involves the use of beads coated with scintillant. Bound to
the beads are acceptor molecules such as antibodies, receptors or enzyme
substrates which interact with ligands or enzymes in a reversible manner.

For a typical protease assay the substrate peptide is biotinylated at
one end and the other end is radiolabelled with low energy emitters
such as 125I or 3H. The labeled substrate is then incubated with the
enzyme. Avidin coated SPA beads are then added which bind to the
biotin. When the substrate peptide is cleaved by the protease, the
radioactive emitter is no longer in proximity to the scintillant bead and
no light emission takes place. Inhibitors of the protease will leave the
substrate intact and can be identified by the resulting light emission
which takes place in their presence.
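
Because the SPA signal falls when the substrate is cleaved and is retained when the protease is inhibited, a test well can be scored against two control wells. The following Python sketch shows one common way of normalizing such readings. It is not taken from the present disclosure. The function name, the control definitions and the count values are all hypothetical assumptions chosen for illustration.

```python
def percent_inhibition(sample_cpm, cleaved_control_cpm, intact_control_cpm):
    """Score a test well on a 0-100 scale from SPA counts.

    cleaved_control_cpm: counts with enzyme and no test compound
        (substrate cleaved, low signal).
    intact_control_cpm: counts with no enzyme
        (substrate intact, high signal).
    """
    window = intact_control_cpm - cleaved_control_cpm
    if window <= 0:
        raise ValueError("intact control must read higher than cleaved control")
    return 100.0 * (sample_cpm - cleaved_control_cpm) / window


if __name__ == "__main__":
    # Hypothetical counts per minute, for illustration only.
    print(percent_inhibition(5500, 1000, 10000))
```

A well reading close to the no-enzyme control scores near 100 and would be flagged as containing a candidate inhibitor. A well reading close to the uninhibited control scores near 0.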

Another type of protease assay utilizes the phenomenon of
surface plasmon resonance (SPR). A novel, high throughput enzymatic
assay utilizing surface plasmon resonance technology has been
successfully developed. Using this assay, and a dedicated BIAcore™
instrument, at least 1000 samples per week can be screened for either
their enzymatic activity or their inhibitory effects toward the enzymatic
activity, in a 96 well plate format. This methodology is readily adaptable
to any enzyme-substrate reaction. The advantage of this assay over the
SPA assay is that it does not require a radiolabeled peptide substrate.

The following examples are included to illustrate the present
invention but not to limit it.

Example 1

Production of HCV NS3 Protease

A. Plasmid constructions.

Several plasmids were designed and constructed using standard
recombinant DNA techniques (Sambrook, Fritsch & Maniatis) to express
the HCV protease in E. coli (Figs. 2-7). All HCV specific sequences
originated from the parental plasmid pBRTM/HCV 1-3011 (Grakoui et
al. 1993). To express the N-terminal 183 amino acid versions of the
protease, a stop codon was inserted into the HCV genome using
synthetic oligonucleotides (Fig. 3). The plasmids designed to express the
N-terminal 246 amino acid residues were generated by the natural Nco I
restriction site at the C-terminus.

(i) Construction of the plasmid pBJ 1015 (Figure 2)

The plasmid pBRTM/HCV 1-3011 containing the entire HCV genome
(Grakoui A., et al., J. Virol. 67: 1385-1395) was digested with the
restriction enzymes Sca I and Hpa I and the 7138 bp (base pair) DNA
fragment was isolated and cloned into the Sma I site of pSP72 (Promega)
to produce the plasmid pRJ201. The plasmid pRJ201 was digested with
Msc I and the 2106 bp Msc I fragment was isolated and cloned into the
Sma I site of the plasmid pBD7. The resulting plasmid pMBM48 was
digested with Kas I and Nco I, and the 734 bp DNA fragment after blunt
ending with Klenow polymerase was isolated and cloned into Nco I
digested, Klenow polymerase treated pTrc HIS B seq expression plasmid
(Invitrogen). The ligation regenerated a Nco I site at the 5' end and Nsi I
site at the 3' end of HCV sequence. The plasmid pTHB HCV NS3 was
then digested with Nco I and Nsi I, and treated with Klenow polymerase
and T4 DNA polymerase, to produce a blunt ended 738 bp DNA
fragment which was isolated and cloned into Asp I cut, Klenow
polymerase treated expression plasmid pQE30 (HIV). The resulting
plasmid pBJ 1015 expresses HCV NS3 (246 amino acids) protease.

(ii) Construction of the plasmid pTS 56-9 with a stop codon after
amino acid 183 (Figure 3)

The plasmid pTHB HCV NS3 was digested with Nco I, treated
with Klenow polymerase, then digested with Bst Y I; and the DNA
fragment containing HCV sequence was isolated and cloned into Sma I
and Bgl II digested pSP72. The resulting plasmid pTS 49-27 was then
digested with Bgl II and Hpa I and ligated with a double stranded
oligonucleotide:

GA TCA CCG CTC TAG ATCT
     T GGC CAc ATc TAGA (SEQ ID NO 11)

to produce pTS 56-9. Thus, a stop codon was placed directly at the end
of DNA encoding the protease catalytic domain of the NS3 protein. This
enabled the HCV protease to be expressed independently from the
helicase domain of the NS3 protein.

(iii) Construction of the plasmid pJB 1006 fused with a peptide of
positively charged amino acids at the carboxy terminus of NS3 183
(Figure 4)

The plasmid pTS 56-9 was digested with Sph I and Bgl II and the DNA
fragment containing HCV sequence was isolated and cloned into a Sph I,
Bgl II cut pSP72. The resulting plasmid pJB 1002 was digested with Age I
and Hpa I and ligated to a double stranded oligonucleotide,

CCG GTC CGG AAG AAA AAG AGA CGC TAG C
     AG GCC TTC TTT TTC TCT GCG ATC G

(SEQ ID NO 12), to construct pJB 1006. This fused the hydrophilic,
solubilizing motif onto the NS3 protease.
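
The coding content of the linker can be checked with a short script. The following is a minimal sketch, not part of the original disclosure: it assumes the top strand of SEQ ID NO 12 is read in the frame suggested by the printed triplet grouping, and it uses the standard genetic code. The names `translate` and `TOP_STRAND` are illustrative only.

```python
# Standard genetic code with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete codons starting at position 0; '*' marks a stop."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Top strand of the linker as printed in the text (SEQ ID NO 12).
# Assumption: the reading frame follows the printed triplet grouping.
TOP_STRAND = "CCG GTC CGG AAG AAA AAG AGA CGC TAG C"

protein = translate(TOP_STRAND)
positive = sum(residue in "KR" for residue in protein)
print(protein, positive)
```

Under these assumptions the linker reads Pro-Val followed by six lysine or arginine residues and a stop codon, which agrees with the six positively charged residues described for the solubilizing motif.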



(iv) Construction of the plasmid pBJ 1022 expressing His-NS3(183)-HT
in E. coli (Figure 5)

The plasmid pJB 1006 was digested with NgoM I and Nhe I and the 216
bp DNA fragment was isolated and cloned into NgoM I, Nhe I cut pBJ
1015 to construct plasmid pBJ 1019. The plasmid pBJ 1019 was digested
with Nar I and Pvu II, and treated with Klenow polymerase to fill in 5'
ends of Nar I fragments. The expression plasmid pQE31 (Invitrogen) was
digested with BamH I and blunt ended with Klenow polymerase. The 717 bp
Nar I-Pvu II DNA fragment was isolated and ligated to the 2787 bp
BamH I/Klenowed-Msc I (Bal I) fragment of the expression plasmid
pQE31 (Invitrogen). The recombinant plasmid, pBJ 1022, obtained after
transformation into E. coli, expresses His NS3(2-183)-HT, which does not
contain any HIV protease cleavage site sequence. The plasmid also
contains a large deletion in the CAT (Chloramphenicol Acetyl
Transferase) gene.

(v) Construction of the plasmid pNB(-V)182-A4A HT (Figure 6) 

The plasmid pMBM 48 was digested with Eag I and Xho I, treated with
Klenow polymerase, and the 320 bp DNA fragment was isolated and
cloned into BamH I cut, blunt ended pSP 72 to construct the plasmid
pJB1004. The 320 bp fragment encodes 7 amino acids from the carboxy
terminal of NS3(631), all of NS4A, and the amino terminal 46 amino
acids of NS4B. The recombinant plasmid pJB1004 was digested with Eag I
and Cel II and blunt ended with Klenow polymerase. The 220 bp DNA
fragment was isolated and cloned into the expression plasmid pQE30,
which was digested with BamH I and blunt ended with Klenow
polymerase prior to ligation. The resulting plasmid pJB 1011 was
digested with NgoM I and Hind III and ligated to a double stranded
oligonucleotide,

CCG GCA ATT ATA CCT GAC AGG GAG GTT CTC TAC CAG GAA TTC
     GT TAA TAT GGA CTG TCC CTC CAA GAG ATG GTC CTT AAG

GAT GAG ATG GAA GAG TGC CGG AAG AAA AAG AGA CGC A
CTA CTC TAC CTT CTC ACG GCC TTC TTT TTC TCT GCG TTC GA

(SEQ ID NO 13)

to construct the plasmid pNB 4A HT. The plasmid pNB 4A HT was
digested with Msl I and Xba I. The 1218 bp DNA fragment was isolated
and cloned into Age I cut, Klenow polymerase treated, Xba I cut vector
DNA of pBJ 1019. The ligation results in a substitution of the 183rd
amino acid residue, valine, by a glycine residue in NS3, and a deletion
of the amino terminal three amino acid residues of NS4A at the junction.
The recombinant plasmid pNB182A4A HT, comprising
NS3(182aa)-G-NS4A(4-54 amino acid), does not contain the NS3/NS4A
cleavage site sequence at the junction and is not cleaved by the
autocatalytic activity of NS3. Finally the plasmid pNB182A4A HT (SEQ ID
NO 8) was digested with Stu I and Nhe I, the 803 bp DNA fragment was
isolated and cloned into Stu I and Nhe I cut plasmid pBJ 1022. The
resulting plasmid pNB(-V)182-A4A HT contains a deletion of the HIV
sequence from the amino terminus end of the NS3 sequence and in the CAT
gene (SEQ ID NO 27).

(vi) Construction of the plasmid pTSHis HIV-NS3 (Figure 7)

The plasmid pTS56-9 was digested with Bgl II and treated with
Klenow polymerase to fill in 5' ends. The plasmid was then digested
with NgoM I, and the blunt ended Bgl II/NgoM I fragment containing
the NS3 sequence was isolated and ligated to the SgU, Klenow treated
NgoM I cut and Sal I klenowed pBJ 1015. The resulting plasmid is
designated pTSHis HIV 183.

Example 2

Purification of HCV NS3 Protease having a Solubilizing Motif

Purification of His182HT (SEQ ID NO 4) and
His(-V)182A4AHT (SEQ ID NO 8)



The recombinant plasmids pBJ1022 and pNB(-V)182A4A were
used to transform separate cultures of E. coli strain M15 [pREP4]
(Qiagen), which over-expresses the lac repressor, according to methods
recommended by the manufacturer. M15 [pREP4] bacteria harboring
recombinant plasmids were grown overnight in broth containing 20 g/L
bacto-tryptone, 10 g/L bacto-yeast extract, 5 g/L NaCl (20-10-5 broth)
and supplemented with 100 µg/ml ampicillin and 25 µg/ml kanamycin.
Cultures were diluted down to O.D.600 of 0.1, then grown at 30°C to
O.D.600 of 0.6 to 0.8, after which IPTG was added to a final concentration
of 1 mM. At 2 to 3 hours post-induction, the cells were harvested by
pelleting, and the cell pellets were washed with 100 mM Tris, pH 7.5. Cell
lysates were prepared as follows: to each ml equivalent of pelleted
fermentation broth was added 50 µl sonication buffer (50 mM sodium
phosphate, pH 7.8, 0.3 M NaCl) with 1 mg/ml lysozyme; the cell suspension
was placed on ice for 30 min. The suspension was then brought to a final
concentration of 0.2% Tween-20, 10 mM dithiothreitol (DTT), and
sonicated until cell breakage was complete. Insoluble material was
pelleted at 12,000 x g in a microcentrifuge for 15 minutes, the soluble
portion was removed to a separate tube, and the soluble lysate was then
brought to a final concentration of 10% glycerol. Soluble lysates from
cells expressing the plasmids produce strongly immunoreactive bands of
the predicted molecular weight. Soluble lysates prepared for Ni2+
column purification were prepared with 10 mM β-mercaptoethanol
(BME) instead of DTT. Lysates were stored at -80°C.

Purification using Ni2+-nitrilotriacetic acid (NTA) agarose (QIAGEN)

The proteins were then purified by placing the extracted lysate on
an NTA agarose column. NTA agarose column chromatography was
used because the histidine tag which was fused to the N-terminus of the
proteases readily binds to the nickel column. This produces a powerful
affinity chromatographic technique for rapidly purifying the soluble
protease. The column chromatography was performed in a batch mode.
The Ni2+ NTA resin (3 ml) was washed twice with 50 ml of Buffer A
(50 mM sodium phosphate pH 7.8 containing 10% glycerol, 0.2% Tween-20,
10 mM BME). The lysate obtained from a 250 ml fermentation (12.5
ml) was incubated with the resin for one hour at 4°C. The flow through
was collected by centrifugation. The resin was packed into a 1.0 x 4 cm
column and washed with buffer A until the baseline was reached. The
bound protein was then eluted with a 20 ml gradient of imidazole
(0-0.5 M) in buffer A. Eluted fractions were evaluated by SDS-PAGE and
western blot analysis using a rabbit polyclonal antibody to His-HIV 183.
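
The lysate volume quoted for the 250 ml fermentation follows from the 50 µl of sonication buffer added per ml equivalent of pelleted broth. The following is a minimal sketch of that arithmetic, assuming the lysate volume is approximately the buffer volume added (the pellet volume and later additions such as glycerol are ignored); the function name is illustrative only.

```python
def lysate_volume_ml(culture_ml: float, buffer_ul_per_ml: float = 50.0) -> float:
    """Approximate lysate volume (ml) from culture volume and buffer ratio.

    Assumption: lysate volume is taken to equal the sonication buffer added,
    at 50 microlitres per ml equivalent of pelleted fermentation broth.
    """
    return culture_ml * buffer_ul_per_ml / 1000.0


print(lysate_volume_ml(250))  # volume for the 250 ml fermentation in the text
```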

Purification using POROS metal-chelate affinity column

In an alternative method to purify the proteins, the lysate containing the
proteins was applied to a POROS metal-chelate affinity column.
Perfusion chromatography was performed on a POROS MC metal
chelate column (4.6 x 50 mm, 1.7 ml) precharged with Ni2+. The sample
was applied at 10 ml/min and the column was washed with buffer A.
The column was step eluted with ten column volumes of buffer A
containing 25 mM imidazole. The column was further eluted with a 25
column volume gradient of 25-250 mM imidazole in buffer A. All
eluted fractions were evaluated by SDS-PAGE and western blot analysis
using rabbit polyclonal antibody.



Example 3

Peptide Synthesis of the 5A/5B and 4B/5A Substrates

The peptides 5A/5B and 4B/5A substrates (SEQ ID NOs 16, 18, 19, 20 and
21) were synthesized using Fmoc chemistry on an ABI model 431A
peptide synthesizer. The manufacturer-recommended FastMoc™
activation strategy (HBTU/HOBt) was used for the synthesis of the 4A
activator peptide. A more powerful activator, HATU, with or without
the additive HOAt, was employed to assemble the 5A/5B substrate peptides
on a preloaded Wang resin. The peptides were cleaved off the resin and
deprotected by a standard TFA cleavage protocol. The peptides were
purified on reverse phase HPLC and confirmed by mass spectrometric
analysis.

Example 4

HPLC Assay Using a Synthetic 5A/5B Peptide Substrate

To test the proteolytic activity of the HCV NS3 protease, the peptide
DTEDVVCC SMSYTWTGK (SEQ ID NO 16) and soluble HCV NS3 (SEQ
ID NO 27) were placed together in an assay buffer. The assay buffer was
50 mM sodium phosphate pH 7.8, containing 15% glycerol, 10 mM DTT,
0.1% Tween 20 and 200 mM NaCl. The protease activity of SEQ ID NO
27 cleaved the substrate into two product peptides, namely 5A and 5B.
The substrate and the two product peptides were separated on a
reversed-phase HPLC column (Dynamax, 4.5 x 250 mm) with a pore size of
300 Å and a particle size of 5 µm. The column was equilibrated with
0.1% TFA (Solvent A) at a flow rate of 1 ml per minute. The substrate and
the product peptide standards were applied to the column equilibrated
in A. Elution was performed with an acetonitrile gradient (Solvent
B = 100% acetonitrile in A). Two gradients were used for elution (5% to
70% B in 50 minutes followed by 70% to 100% B in 10 minutes).
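
The two-step elution programme can be expressed as a piecewise-linear function of time, which is convenient when relating a retention time to the solvent composition at which a peptide elutes. The following is a minimal sketch, assuming each segment is linear and that the second segment starts immediately after the first; `GRADIENT` and `percent_b` are illustrative names.

```python
from bisect import bisect_right

# (time in minutes, percent Solvent B) breakpoints taken from the text:
# 5% to 70% B over 50 minutes, then 70% to 100% B over a further 10 minutes.
GRADIENT = [(0.0, 5.0), (50.0, 70.0), (60.0, 100.0)]


def percent_b(t_min: float, program=GRADIENT) -> float:
    """Percent Solvent B at time t_min, assuming linear segments."""
    times = [t for t, _ in program]
    if t_min <= times[0]:
        return program[0][1]
    if t_min >= times[-1]:
        return program[-1][1]
    i = bisect_right(times, t_min)  # index of the segment's end point
    (t0, b0), (t1, b1) = program[i - 1], program[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 25, 50, 55, 60):
    print(t, percent_b(t))
```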

In another experiment, partially purified SEQ ID NO 27 or vector
control was incubated with 100 µM of substrate for 3, 7 and 24 hours at
30°C. The reaction mixture was quenched by the addition of TFA to
0.01% and applied to the reversed-phase HPLC column. The fractions
from each run were evaluated by mass spectrometry and sequencing.

25 



Example 5 

Analysis of NS3 Protease Activity By In Vitro Translation Assay

To detect HCV NS3 protease activity in trans, we have expressed a
40 kD protein containing the NS5A/5B cleavage site in a cell-free
translation system and used that as the substrate for the enzyme. The
substrate protein produces two protein products of apparent molecular
weight 12.5 kD (NS 5A') and 27 kD (NS5B') upon cleavage by the HCV
NS3 protease.

The plasmid pTS102 encoding the substrate 5A/5B was linearized
by digestion with EcoR I and was transcribed using T7 RNA polymerase
in vitro. The RNA was translated in the presence of 35S-methionine in
rabbit reticulocyte lysates according to the manufacturer's (Promega)
protocol to produce HCV specific protein. In a 20 µl total reaction
mixture containing 10 mM Tris, pH 7.5, 1 mM DTT, 0.5 mM EDTA, and
10% glycerol was placed 2 to 8 µl of 35S-methionine-labeled translated
5A/5B substrate. The reaction was started with the addition of 10 µl of
HCV NS3 protease in solubilization buffer (50 mM Na Phosphate, pH
7.8, 0.3 M NaCl, 0.2% Tween 20, 10 mM DTT or BME, 10% glycerol), and
incubated at 30°C for the specified time. Reactions were stopped by
adding an equal volume of 2X Laemmli sample buffer (Enprotech Inc.)
and heating at 100°C for 3 minutes. Reaction products were separated by
SDS PAGE electrophoresis; gels were fixed, dried and subjected to
autoradiography.

The in vitro translated substrate was used to assay the HCV NS3
proteases expressed by E. coli harboring plasmids pBJ1022 and
pNB(-V)182A4A (SEQ ID NOs 4 and 27). In a two hour assay incubated at
30°C, pBJ1022 crude soluble lysate at 3, 6, and 10 µl was able to cleave
the 5A/5B substrate in a dose responsive manner, producing the expected
cleaved products: 5A (12.5 kD) and 5B (27 kD) as shown by SDS PAGE
analysis. Corresponding vector control lysate did not show any cleavage
activity over background. The crude soluble lysate derived from
pNB182A4A was much more active in this assay. After only 30 minutes
incubation, the 5A and 5B cleavage products were detected using as little
as 0.125 µl cell lysate, with increasing amounts of lysate showing
increased cleavage, reaching a maximum at 1 µl.

We performed a time course study of the NS3 protease activity of
pNB182A4A in an in vitro translation assay for further characterization
of the activity. At 30°C, in a reaction containing the translated 5A/5B
substrate plus pNB182A4A soluble lysate at 1 µl per 20 µl reaction
volume, the 5A and 5B cleavage products appeared beginning at 1
minute, and increased with time at 2.5, 5, 10, and 20 minutes.

Since we were able to demonstrate HCV NS3 protease activity
using crude cell lysates of pBJ1022 and pNB182A4A, we wanted to at least
partially purify the expressed proteins in an effort to remove bacterial
proteases from these preparations. For this purpose, affinity column
chromatography using Ni2+ bound ligands was found to be effective,
binding the histidine tag at the amino terminal ends of the expressed
proteins, and subsequently releasing the bound proteins by imidazole
elution. The imidazole-eluted fractions resulting from the purification
of pNB182A4A on a Ni-NTA column were tested for activity in the in
vitro translation assay. The resultant fractions were all able to cleave the
translated 5A/5B substrate, producing the expected 5A and 5B products.
Background bacterial protease activity was not detected in these eluted
fractions.

As was described above, pBJ1022 was purified by another method
of Ni2+ chelate chromatography, using POROS Ni2+ chelate resin and
perfusion chromatography. Imidazole-eluted fractions which were
positive for immunoreactivity with antibody to NS3 183 were tested for
HCV protease activity by in vitro translation assay. In order to optimize
detection of activity in this assay for HCV protease, reactions were
supplemented with a truncated peptide derived from the NS4A cofactor,
which has been shown to enhance cleavage at the 5A/5B site by NS3
protease. The cofactor was supplied as a synthetic peptide containing
amino acids 22 to 54 of NS4A (strain HCV-BK) at a final concentration of
1 µM. All fractions tested were active in this translation assay.

Example 6

ENHANCEMENT BY 4A PEPTIDES

NS4A is able to enhance the NS3 serine protease activity at the
NS5A/5B site in mammalian cells that transiently coexpress NS3,
NS4A, and the various HCV non-structural polyproteins containing
downstream cleavage sites. We have studied this enhancement activity
in a well defined cell-free biochemical assay, using the partially purified
E. coli-expressed pBJ1022 as a source of NS3 protease, and synthetic
peptides containing various truncations of NS4A. In our first
experiment we used a crude cell lysate of pBJ1022 as the enzyme and
a truncated NS4A synthetic peptide, the carboxy-terminal 33-mer from
amino acid 22 to amino acid 54, in the in vitro translation cleavage
reaction. The C-terminal 33 amino acid peptide of NS4A was able to
enhance the activity of the NS3 catalytic domain in a dose dependent
manner from 0.01 µM to 1.0 µM peptide, producing the expected products
of 5A (12.5 kD) and 5B (27 kD) from the 40 kD translated 5A/5B substrate.
Without the 4A peptide a relatively low cleavage activity by the protease
alone was observed at the short incubation time of 30 minutes. The 4A
peptide by itself, or in combination with crude lysate produced from
cells harboring the vector plasmid, did not cleave the substrate.

To further characterize NS4A enhancement activity, additional
truncations were made to the NS4A sequence. Truncated peptides were
evaluated for their activity in the in vitro translation assay using Ni2+
chelate column-purified pBJ1022 (NS3 catalytic domain). We observed
that in addition to the C-terminal 33 amino acid peptide, an 18 amino acid
peptide containing the NS4A sequence from amino acid 19 through 36
was able to enhance the NS3 mediated cleavage activity. Other peptides,
including the N-terminal 21 amino acids, and two shorter truncations
from the carboxyl terminal end, a 22mer and a 15mer, were found to
have no effect; a heterologous peptide of 18 amino acids likewise had no
enhancement activity.

Discussion 

The experiments described in this report clearly demonstrate that
bacterially expressed HCV protease catalyzes cleavage of i) HCV
polyproteins and ii) synthetic peptide substrates in a trans biochemical
assay. The processing activity of the NS3 catalytic domain is enhanced by
NS4A and its derivatives. The activity of the fusion protein containing
the NS3 catalytic domain and NS4A is much superior to that of the NS3
catalytic domain alone.

Hydrophobicity analysis of the catalytic domain of the NS3
protease reveals that the protein is very hydrophobic and that it contains
seven cysteine residues. To neutralize hydrophobicity and thus to
improve solubility we have added six positively charged amino acid
residues as a solubilizing motif. The addition of a solubilizing motif
appears to improve the solubility without affecting the enzymatic
activity.

We have also shown that the HCV NS4A from the Japanese BK
strain enhanced the HCV-H NS3 mediated cleavage at the 5A/5B site.
This suggests that essential elements of recognition may be conserved
among various strains of HCV.

It is clear from the above experimental results that attachment of a
hydrophilic tail (solubilizing motif/water attracting structures) at the
carboxy terminal end of the histidine fused NS3 catalytic domain improved
expression of soluble protein in E. coli. In these experiments six residues
of positively charged amino acids are attached at the carboxy terminal
end of the protein. It should be noted that other fusions, containing
six histidine residues, GST (Glutathione S transferase), MBP (Maltose
binding protein), or thioredoxin alone, did not show improved solubility
of NS3. Another example of a solubilizing motif is an amphipathic helix
tail (peptides having charged and hydrophobic amino acid residues to form
both charged and hydrophobic faces). Addition of an amphipathic helix
at the carboxy terminus of such fusion proteins will be an alternative
way to achieve improvement of solubility without affecting the
enzymatic activity.

The hydrophilic tail used in these experiments consists of six
amino acids. The sequence and length of the hydrophilic amino acids
can be varied to achieve optimal expression of soluble protein.
Therefore the size of the solubilizing motif and the nature of the charged
residues may affect the expression of soluble NS3 in E. coli.

Positioning these water attracting structures/motifs at both ends,
at one end (amino terminal or carboxy terminal), or inserting them within
the NS3 catalytic domain and NS3 (catalytic domain)-4A fusion protein,
may improve solubility of the protein without affecting the activity.



35 



Based on sequence homology to the members of the trypsin
superfamily and the proteases of other members of the flaviviruses, it is
predicted that the amino terminal 181 amino acids of NS3 form the
catalytic domain of HCV NS3 protease. Recently it has also been shown
that a protein of 169 amino acids, containing a 10 amino acid deletion
from the amino terminus and a 2 amino acid deletion from the carboxy
terminus of the catalytic domain, retains full enzymatic activity. The
model we have developed predicts that a protein of 154 amino acids,
containing a deletion of 26 amino acids from the amino terminus and a
deletion of 2 amino acids from the carboxyl terminus, would retain full
enzymatic activity for the 5A/5B substrate.

Analysis of the amino acid sequence of the catalytic domain of
NS3 protease reveals that the protein contains seven cysteine residues,
an odd number, which may cause aggregation. Mutation of one cysteine
residue (located on the surface of the protein molecule and not
involved in the activity) may improve solubility of the protein without
affecting the protease activity.

Using the cell free biochemical assay we have demonstrated that
the synthetic peptide containing 18 amino acids of the HCV NS4A protein
is sufficient to enhance the cleavage at the NS5A/5B site mediated by the
catalytic domain of NS3.

20 

Example 7 

Refolding of Insoluble HCV NS3 Protease

The present example describes a novel process for the refolding of
HCV NS3 protease which does not have a solubilizing motif from an
E. coli inclusion body pellet. This procedure can be used to generate
purified enzyme for activity assays and structural studies.

Extraction and Purification of His-HIV 183 from the E. coli inclusion
body pellet

The plasmid for HisHIV183 was used to transform a culture of
E. coli strain M15 [pREP4] (Qiagen), which over-expresses the lac
repressor, according to methods recommended by the commercial
source. M15 [pREP4] bacteria harboring recombinant
plasmids were grown overnight in 20-10-5 broth supplemented with
100 µg/ml ampicillin and 25 µg/ml kanamycin. Cultures were diluted to
O.D.600 of 0.1, then grown at 37°C to O.D.600 of 0.6 to 0.8, after which
IPTG was added to a final concentration of 1 mM. At 2 to 3 hours
post-induction, the cells were harvested by pelleting, and the cell
pellets were washed with 100 mM Tris, pH 7.5, and pelleted by
centrifugation. The cell pellet was resuspended in 10 ml of 0.1 M
Tris-HCl, 5 mM EDTA, pH 8.0 (Buffer A) for each gm wet weight of pellet.
The pellet was homogenized and resuspended using a Dounce homogenizer.
The suspension was clarified by centrifugation at 20,000 x g for 30
minutes at 4°C. The pellet was sequentially washed with the following
five buffers:

1. Buffer A
2. 1.0 M sodium chloride (NaCl) in buffer A
3. 1.0% Triton X-100 in buffer A
4. Buffer A
5. 1.0 M guanidine HCl (GuHCl) in buffer A

The washed pellet was solubilized with 5 M GuHCl, 1% beta-
mercaptoethanol in buffer A (3 ml per gm wet wt. of pellet)
using a Dounce homogenizer and centrifuged at 100,000 x g for 30
minutes at 4°C. Purification of denatured HisHIV183 from high
molecular weight aggregates was accomplished by size exclusion on a
SEPHACRYL S-300 gel filtration column.

In particular, an 8 ml sample of the 5.0 M GuHCl E. coli extract
was applied to a 160 ml Pharmacia S-300 column (1.6 x 100 cm) at a flow
rate of 1.0 ml/min. The column buffer was comprised of 5.0 M GuHCl,
0.1 M Tris-HCl, pH 8.0, and 5.0 mM EDTA. The fraction size was 5.0 ml.
Appropriate fractions were pooled based on the results of SDS-PAGE, as
well as N-terminal sequence analysis of the protein transferred to a
Pro-Blot.

Detergent-assisted refolding of HCV protease

The protein was concentrated by ultrafiltration using a 43 mm
Amicon YM10 membrane to 1.0 mg per ml in 5 M GuHCl, 0.1 M Tris-HCl
pH 8.0, 1.0 mM EDTA, 1.0% beta-mercaptoethanol. It was then diluted
50-fold to 0.1 M GuHCl in refolding buffer (100 mM sodium phosphate
pH 8.0, 10 mM DTT, 0.1% lauryl maltoside) and the mixture was
incubated on ice for at least one hour. A 25 ml sample containing 500 µg
of the protein in the refolding buffer was applied to a Pro-RPC HR 3/5
reversed phase chromatography column. To the column was then
applied a solution B comprised of 99.9% H2O + 0.1% trifluoroacetic acid
(TFA). A 10 ml volume of solution C [10% H2O, 90% acetonitrile (AcN)
+ 0.1% TFA] was applied to the column at a 0 - 60% gradient into
solution B at a flow rate of 0.5 ml/min and a fraction size of 0.5 ml. The
fractions were monitored at A214; 2.0 absorbance units full scale (AUFS).
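
The figures quoted for the dilution step are mutually consistent, as a short calculation shows. The following is a minimal sketch using only values stated in the text (5 M GuHCl, 1.0 mg/ml protein, 50-fold dilution, 25 ml applied sample); the function and variable names are illustrative only.

```python
import math


def dilute(concentration: float, fold: float) -> float:
    """Concentration after a simple fold dilution."""
    return concentration / fold


guhcl_molar = dilute(5.0, 50)          # GuHCl after 50-fold dilution, in M
protein_mg_per_ml = dilute(1.0, 50)    # protein after 50-fold dilution
protein_ug_in_sample = protein_mg_per_ml * 1000 * 25  # µg in the 25 ml sample

print(guhcl_molar, protein_mg_per_ml, protein_ug_in_sample)
assert math.isclose(protein_ug_in_sample, 500.0)
```

The diluted protein concentration of about 20 µg/ml reflects the very low protein concentration at which refolding is carried out.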

Fractions containing the protein (corresponding to peak 1) were
pooled for renaturation by stepwise dialysis. The fractions were first
dialyzed in 0.1% TFA in 25% glycerol overnight at 4°C; then dialyzed in
0.01% TFA in 25% glycerol overnight at 4°C; then dialyzed in 0.001%
TFA in 25% glycerol for 3.0 hours; then dialyzed for 3 hours at 4°C in 50
mM NaPO4, pH 6.0, 10 mM dithiothreitol (DTT) in 25% glycerol. The
protein was then dialyzed for 3.0 hours at 4°C in 50 mM NaPO4, pH 7.0,
0.15 M NaCl, 10 mM DTT in 25% glycerol; and then finally dialyzed in 50
mM NaPO4, pH 7.8, 0.3 M NaCl, 10 mM DTT, 0.2% Tween 20 in 25%
glycerol. This resulted in purified, refolded, soluble, active HCV NS3
protease.

25 

Far UV circular dichroism (CD) analysis of the protein was used 
to monitor the refolding from an acid denatured state to a folded state at 
neutral pH. The protein recovery was monitored by a UV scan and SDS- 
PAGE analysis. 

30 

Detergent-assisted Refolding of His-HIV183

HisHIV183 was quantitatively extracted from an E. coli inclusion
body pellet. SDS-PAGE analysis at the various stages of extraction shows
that sequential washes are essential to remove significant amounts of
the contaminating proteins. HisHIV183 was extracted from the washed




inclusion body pellet in the presence of 5 M GuHCl. The 5 M GuHCl
extract was applied to a SEPHACRYL S-300 column and the appropriate
fractions were pooled based on SDS-PAGE analysis. The amino acid
sequence of the first ten residues was verified.

5 

Refolding was performed at very low concentrations of protein, 
in the presence of DTT, lauryl maltoside and glycerol at The diluted 
protein was concentrated on a Pro-RPC reversed phase column. Two 
peaks were obtained based on the UV and protein profile. Only Peak 1 

yielded soluble protein after stepwise dialysis. Far UV CD spectral
analysis was used to monitor refolding from a denatured state at acid pH
to a folded state at neutral pH. At pH 7.4, the protein was found to
exhibit significant amounts of secondary structure, consistent with
that of a beta sheet protein. At low pH, the CD spectrum showed that it is
fully random coil, having a minimal molar ellipticity at 200 nm. The
ratio of this minimum at 200 nm to that of the shoulder at 220 nm is
approximately 4:1. This ratio decreased when the secondary structure
formation occurred at neutral pH.

A UV scan at each step of dialysis showed that the protein
recovery was >90% up to pH 7.4 and that there was no light scattering
effect due to protein aggregates. SDS-PAGE analysis also indicated that
there was no loss of protein up to pH 7.0 during refolding. Precipitation
of protein occurred at the last step of dialysis, and the soluble protein
was clarified by centrifugation. The overall protein recovery was about
0.10%. The refolded protein was found to be active in a trans-cleavage
assay using the in vitro-translated 5A/5B substrate in the presence of 4A
peptide as described in the next example.



Example 8

Analysis of Refolded NS3 Protease Activity by
In Vitro Translation Assay

To detect HCV NS3 protease activity in trans, we have
expressed a 40 kD protein containing the NS5A/5B cleavage site in a
cell-free translation system and used that as the substrate for the enzyme.
The substrate protein produces two protein products of apparent
molecular weight 12.5 kD (NS 5A') and 27 kD (NS5B') upon cleavage by
the HCV NS3 protease.

The plasmid pTS102 encoding the substrate 5A/5B was
linearized by digestion with EcoR I and was transcribed using T7 RNA
polymerase in vitro. The RNA was translated in the presence of
35S-methionine in rabbit reticulocyte lysates according to the
manufacturer's (Promega) protocol to produce HCV specific protein. In a
20 µl total reaction mixture containing 10 mM Tris, pH 7.5, 1 mM DTT,
0.5 mM EDTA, and 10% glycerol was placed 2 to 8 µl of 35S-methionine-labeled
translated 5A/5B substrate. The reaction was started with the addition
of 10 µl of HCV NS3 protease (SEQ ID NO: 5) with an approximately
equimolar amount (2 µM) of the carboxy-terminal 33-mer cofactor NS4A
(SEQ ID NO: 29) in solubilization buffer (50 mM Na Phosphate, pH 7.8,
0.3 M NaCl, 0.2% Tween 20, 10 mM DTT or BME, 10% glycerol), and
incubated at 30°C for about one hour. Reactions were stopped by adding
an equal volume of 2X Laemmli sample buffer (Enprotech Inc.) and
heating at 100°C for 3 minutes. Reaction products were separated by SDS
PAGE electrophoresis; gels were fixed, dried and subjected to
autoradiography.

In this assay, the refolded protease was able to cleave the 5A/5B
substrate in a dose responsive manner, producing the expected cleaved
products: 5A (12.5 kD) and 5B (27 kD) as shown by SDS PAGE analysis. The
production of cleaved 5A and 5B polypeptides from the 5A/5B substrate
is proof that soluble, active, refolded HCV protease was indeed produced
by the process of example 7.

30 

Example 9 

Surface Plasmon Resonance Assay

The present example illustrates a method for determining if a
compound can be useful as an HCV protease inhibitor using the surface
plasmon resonance assay. Figures 8A and 8B illustrate the technique.

BIAcore™ is a processing unit for Biospecific Interaction
Analysis. The processing unit integrates an optical detection system
with an autosampler and a microfluidic system. BIAcore™ uses the
optical phenomenon of surface plasmon resonance (SPR) to monitor
interaction between biomolecules. SPR is a resonance phenomenon between
incoming photons and electrons on the surface of a thin metal film.
Resonance occurs at a sharply defined angle of incident light. At this
angle, called the resonance angle, energy is transferred to the electrons
in the metal film, resulting in a decreased intensity of the reflected
light. The SPR response depends on a change in refractive index in the
close vicinity of the sensor chip surface, and is proportional to the mass
of analyte bound to the surface. BIAcore continuously measures the
resonance angle on a relative scale of resonance units (RU) and displays
it as an SPR signal in a sensorgram, where RU are plotted as a function
of time.

In addition, BIAcore'™ uses continuous flow technology. One 
interactant is immobilized irreversibly on the sensor chip, comprising a 

20 non-crosslinked carboxymethylated dextran providing a hydrophilic 
environment for bimolecular interaction. Solution containing the 
other interactant flow continuously over the sensor chip siuf ace. As 
molecules from the solution bind to the immobilized ligand, the 
resonance angle changes resulting in a signal registered by the 

25 instrument. 



In tills methodology, the enzymatic reactions are carried out 
outside of the BIAcore, i.e. in reaction tubes or 96-well tissue culture 
plates, as it is conventionally done for any of the currently available 
30 high throughput assays. The SPR is only used as a detection means for 
determination of the amount of an intact substrate remaining in a 
solution with and without the enzyme after the reaction is quenched. 

In order to measure the amount of the intact substrate prior to the
addition of enzyme, a means of capturing the substrate onto the sensor
chip had to be established. In addition, to satisfy the requirement for a
high throughput assay on the BIAcore, the substrate needed to be
removed from the surface subsequent to completion of analysis. This is
required since the same surface will be used for the subsequent
reactions. To accomplish these two requirements, a phosphotyrosine is
synthetically attached to one end of the substrate. The phosphotyrosine
was chosen due to the commercial availability of an
anti-phosphotyrosine monoclonal antibody. The antibody is covalently
attached to the sensor chip by standard amine coupling chemistry. The
anti-phosphotyrosine antibody, bound permanently to the chip, is used
to capture the phosphotyrosine-containing substrate in a reversible
manner. The antibody-phosphotyrosine interaction is ultimately used
to capture and release the peptide substrate when desired by
regeneration of the surface with various reagents, i.e. 2 M MgCl2.

Introduction of the intact peptide onto the antibody surface
results in a larger mass, which is detected by the instrument. To follow
the extent of peptide cleavage, a mixture of peptide substrate and
enzyme is incubated for the desired time and then quenched.
Introduction of this mixture, containing the cleaved peptide and the
intact peptide, to a regenerated antibody surface results in a lower mass
value than that detected for a sample containing only intact peptide.
The difference in the two values is then used to calculate the exact
amount of intact peptide remaining after cleavage by the enzyme.
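
The comparison of the two readings can be sketched as follows. This is a minimal illustration in Python, assuming the SPR response is linear in the amount of intact peptide captured on the antibody surface. The function names and the optional baseline term are hypothetical and are not part of the described method.

```python
def fraction_intact(ru_sample, ru_intact_control, ru_baseline=0.0):
    """Estimate the fraction of peptide still intact from SPR readings.

    ru_sample          -- response (RU) of the quenched enzyme reaction
    ru_intact_control  -- response (RU) of the peptide-only sample
    ru_baseline        -- response (RU) with no intact peptide (assumed 0)

    Assumes the response is linear in the amount of intact peptide.
    """
    span = ru_intact_control - ru_baseline
    if span <= 0:
        raise ValueError("intact control must exceed the baseline")
    frac = (ru_sample - ru_baseline) / span
    # Clamp so that measurement noise cannot give values outside 0..1.
    return min(1.0, max(0.0, frac))


def percent_cleaved(ru_sample, ru_intact_control, ru_baseline=0.0):
    """Percent of peptide cleaved, derived from the intact fraction."""
    return 100.0 * (1.0 - fraction_intact(ru_sample, ru_intact_control, ru_baseline))
```

A sample that reads half the response of the peptide-only control would be reported as 50% cleaved under this linearity assumption.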

Although the reduction in mass can be directly followed with
many large substrates, due to the small mass of a typical synthetic
peptide substrate (10-20 amino acids, 1-3 kDaltons), the mass difference,
and thus the signal difference between the intact and cleaved peptide, is
very small relative to the signal-to-noise ratio of the instrument. To
circumvent this low sensitivity, we attached a biotin at the N-terminus
of the peptide. By adding streptavidin, and thus tagging the peptide,
prior to injection of the tagged peptide onto the antibody surface of the
chip, the signal due to the presence of streptavidin will be higher. Using
this approach, a cleaved peptide lacking the N-terminal half, which is
tagged with streptavidin, will result in a much lower signal.

The HCV protease 5A-5B peptide substrate,
DTEDVVACSMSYTWTGK (SEQ ID NO 18), was synthesized with an
additional phosphotyrosine at the C-terminus and biotin at the
N-terminus. The biotin was then tagged with streptavidin. An
anti-phosphotyrosine monoclonal antibody, 4G10 (Upstate Biotechnology
Inc., Lake Placid, New York), was coupled to the sensor chip. In the
absence of HCV protease, the intact, streptavidin-tagged biotinylated
phosphotyrosine peptide results in a large signal (large mass unit/large
signal) through its interaction with the anti-phosphotyrosine
monoclonal antibody (Mab).

The protease-catalyzed hydrolysis of the phosphotyrosine-
biotinylated peptide was carried out in a 96 well plate. The reaction was
stopped with an equal volume of mercuribenzoate. The cleaved peptide,
which lacks the tagged streptavidin (less mass), results in the loss of
response units (lower signal).

Using this method, numerous compounds can be tested for their
inhibitory activity since the antibody surface can be regenerated
repetitively with 2 M MgCl2.

Procedure for Coupling Anti-phosphotyrosine Mab to the Sensor Chip

The anti-phosphotyrosine Mab is coupled to the
carboxymethylated dextran surface of a sensor chip in the following
manner. The flow rate used throughout the coupling procedure is 5
µl/min. The surface is first activated with a 35 µl injection of NHS/EDC
(N-hydroxysuccinimide/N-dimethylaminopropyl-N'-
ethylcarbodiimide-HCl). This is followed by a 40 µl injection of Mab
4G10 at 50 µg/ml in 10 mM sodium acetate buffer, pH=4.0. Any
remaining activated esters are then blocked by the injection of 35 µl of
1 M ethanolamine. These conditions result in the immobilization of
approximately 7^00 response units (420 µM) of antibody.
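
Because every step is specified as an injected volume at a fixed flow rate, the contact time of each injection follows directly. The short Python sketch below works this out for the 5 µl/min flow rate stated above. The function name is hypothetical and the calculation ignores any dead volume in the instrument, which is an assumption.

```python
def injection_time_s(volume_ul, flow_ul_per_min=5.0):
    """Contact time in seconds for an injection of volume_ul microliters
    at a constant flow rate (dead volume ignored, by assumption)."""
    if flow_ul_per_min <= 0:
        raise ValueError("flow rate must be positive")
    return volume_ul * 60.0 / flow_ul_per_min


# The 35 µl activation and blocking injections each last 7 minutes at 5 µl/min.
activation_s = injection_time_s(35)
```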




The flow rate used throughout the BIAcore analysis run is 5
µl/min. A 4 µl injection containing streptavidin-tagged peptide
(peptide concentration at 2 µM, streptavidin binding sites concentration
at 9 µM) is carried out. The amount of streptavidin-tagged peptide
bound to the antibody surface (in response units) is measured 30 seconds
after the injection is complete.



WO 96/35717                                          PCT/US96/06389


Regeneration of sensor chip surface

Regeneration of the Mab 4G10 surface is achieved using a 4 µl
pulse of 2 M MgCl2 after each peptide injection. Surfaces regenerated up
to 500 times still showed 100% binding of tagged peptide.

Determination of the Optimal Concentration of Peptide and
Streptavidin

To determine the optimal peptide concentration, a standard curve
was generated using various amounts of peptide (0-10 µM) in the
presence of excess streptavidin. A value in the linear range, 2 µM, was
chosen for standard assay conditions.


The amount of streptavidin required to completely tag the
peptide was determined using a peptide concentration of 2.5 µM and
titrating the amount of streptavidin (µM of binding sites). All the
peptides were shown to be completely tagged when streptavidin
concentrations greater than 3 µM (approximately equimolar to the
peptide concentration) were used. A streptavidin concentration of 9
µM (a 4.5 fold excess) was chosen for standard assay conditions.



Application of Described Methodology to HCV Protease

The HCV protease 5A/5B peptide substrate,
DTEDVVACSMSYTWTGK (SEQ ID NO 18), with phosphotyrosine at the
C-terminus and biotin at the N-terminus, is synthesized. Anti-
phosphotyrosine monoclonal antibody, 4G10, was coupled to the sensor
chip.

In the absence of HCV protease, the intact streptavidin-tagged 
biotinylated phosphotyrosine peptide results in a large signal (large mass 
unit/large response units) through its interaction with the anti- 
phosphotyrosine monoclonal antibody. 
The protease-catalyzed hydrolysis of the phosphotyrosine-
biotinylated peptide was carried out in a 96 well plate. The reaction was
stopped with an equal volume of the quenching buffer containing
mercuribenzoate. Streptavidin, which binds to the biotin, was added to
tag the peptide. The cleaved peptide, which lacks the tagged streptavidin
(less mass), results in the loss of response units.

Using this assay, numerous compounds can be tested for their
inhibitory activity since the antibody surface can be regenerated
repetitively with 2 M MgCl2.

The peptide cleavage activity of HCV protease can be monitored
in a time-dependent manner using the BIAcore-based methodology.
Using the concentrated enzyme and the BIAcore substrate,
Biotin-DTEDVVAC SMSYTWTGK-pY (SEQ ID NO 17), 50% substrate
cleavage is achieved within 1 hour using the BIAcore-based HCV assay.
Based on the amount of enzyme, His-NS3(183)A4AHT, needed to reach
50% cleavage within 2 hours, a time scale desired for development of a
high throughput assay, we estimate that 1 liter of fermentation of the
His-NS3(183)A4AHT construct results in enough protease to run at least
100 reactions on the BIAcore.



Standard Operating Procedure for BIAcore-based HCV Assay

Reactions are prepared in a 96-well tissue culture plate using the
Reaction Buffer (50 mM HEPES, pH 7.4, 20 % glycerol, 150 mM NaCl,
1 mM EDTA, 0.1% Tween-20, 1 mM DTT) as diluent. The final reaction
volume is 100 µl. A sample with the peptide alone (Biotin-DTEDVVAC
SMSYTWTGK-pY) is prepared by addition of 10 µl of peptide stock at 100
µM (prepared in the reaction buffer) to 90 µl of reaction buffer, so that
the final concentration of peptide is 10 µM. Samples comprising
peptide and the enzyme are prepared by addition of 10 µl of peptide
stock at 100 µM and 10 µl of partially purified His-NS3 (183)-A4A-HT
stock at 1.7 mg/ml (both prepared in the reaction buffer) to 80 µl of
reaction buffer, so that the final concentrations of peptide and
enzyme are 10 and 0.1 µM, respectively. The reaction is held at 30°C for
the specified time and then quenched. Quenching is achieved by
transferring a 20-µl aliquot of the reaction mixture to a new tissue




culture plate containing an equal volume of PMB Quenching Buffer (50
mM HEPES, pH 7.4, 150 mM NaCl, 5 mM p-Hydroxymercuribenzoic
Acid, and 13 mM EDTA).




To prepare the quenched reaction mixture for injection onto the
sensor surface, 30 µl of PMB BIAcore Buffer (50 mM HEPES, pH 7.4, 1 M
NaCl) and 30 µl of streptavidin at 0.5 mg/ml in water are added to the 40
µl of the quenched reaction mixture to a final volume of 100 µl. In this
step, all the peptides are tagged with streptavidin prior to the injection
of samples. Finally, 4 µl of this sample is injected over the
anti-phosphotyrosine surface for determination of the intact versus
cleaved peptide. The final concentrations of peptide and streptavidin
in the BIAcore sample are 2 and 9 µM, respectively.
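
The peptide concentrations quoted in this procedure follow from three successive dilutions. The Python sketch below is an arithmetic check only; the helper names are hypothetical and all volumes and concentrations are taken from the procedure above.

```python
def dilute(conc, vol_added, vol_final):
    """Concentration after vol_added of a solution at conc is brought to
    vol_final (same volume units for both arguments)."""
    if vol_final <= 0 or vol_added > vol_final:
        raise ValueError("volumes are inconsistent")
    return conc * vol_added / vol_final


def fold_excess(binding_sites_uM, peptide_uM):
    """Molar excess of streptavidin binding sites over peptide."""
    return binding_sites_uM / peptide_uM


# 10 µl of 100 µM peptide stock in a 100 µl reaction.
reaction_uM = dilute(100.0, 10.0, 100.0)
# 20 µl aliquot mixed with an equal volume of quenching buffer (40 µl total).
quenched_uM = dilute(reaction_uM, 20.0, 40.0)
# 40 µl of quenched mixture brought to 100 µl with buffer and streptavidin.
biacore_uM = dilute(quenched_uM, 40.0, 100.0)
```

Under these figures the peptide is 10 µM in the reaction, 5 µM after quenching and 2 µM in the injected sample, and 9 µM of streptavidin binding sites corresponds to a 4.5-fold excess over 2 µM peptide.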



Experimental Conditions:

Substrate:         l-DTEDVVAC SMSYTWTGK-pX (SEQ ID NO 19)
                   in Reaction buffer without DTT

Concentration:     170 µM (Crude peptide, based on weight)

Enzyme:            10 µl of concentrated His-NS3 (183)-A4A-HT
                   at 1.7 mg/ml

Reaction volume:   100 µl

Reaction buffer:   50 mM HEPES, pH 7.8
                   20 % glycerol
                   150 mM NaCl
                   1 mM EDTA
                   1 mM DTT
                   0.1% Tween-20

Temp:              30° C

Quench with:       p-hydroxymercuribenzoate

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Schering Corporation

(ii) TITLE OF INVENTION: Soluble, Cleavable Substrates of the
Hepatitis C Protease

(iii) NUMBER OF SEQUENCES: 31

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Schering Corp.
(B) STREET: 2000 Galloping Hill Road
(C) CITY: Kenilworth
(D) STATE: New Jersey
(E) COUNTRY: USA
(F) ZIP: 07033-0530

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: Apple Macintosh
(C) OPERATING SYSTEM: Macintosh 7.1
(D) SOFTWARE: Microsoft Word 5.1a

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/439,747
(B) FILING DATE: 12-MAY-1995

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Lunn, Paul G.

(B) REGISTRATION NUMBER: 32,743
(C) REFERENCE/DOCKET NUMBER: JB0509PCT

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 908-298-5061
(B) TELEFAX: 908-298-5388

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 549 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: HCV NS3 Protease



GCG CCC ATC ACG GCG TAC GCC CAG CAG ACG AGA GGC CTC CTA GGG 45
Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly
1               5                   10                  15

TGT ATA ATC ACC AGC CTG ACT GGC CGG GAC AAA AAC CAA GTC GAG 90
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
                20                  25                  30

GGT GAG GTC CAG ATC GTG TCA ACT GCT ACC CAA ACC TTC CTC GCA 135
Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala
                35                  40                  45

ACG TGC ATC AAT GGG GTA TGC TGG ACT GTC TAC CAC GGG GCC GGA 180
Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly
                50                  55                  60

ACG AGG ACC ATC GCA TCA CCC AAG GGT CCT GTC ATC CAG ATG TAT 225
Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr
                65                  70                  75

ACC AAT GTG GAC CAA GAC CTT GTC GGC TGG CCC GCT CCT CAA GGT 270
Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly
                80                  85                  90

TCC CGC TCA TTG ACA CCC TGC ACC TGC GGC TCC TCG GAC CTT TAC 315
Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr
                95                  100                 105

CTC GTT ACG AGG CAC GCC GAC GTC ATT CCC GTC CGC CGG CGA GGT 360
Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly
                110                 115                 120

GAT AGC AGG GGT AGC CTC CTT TCG CCC CGG CCC ATT TCC TAC CTA 405
Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu
                125                 130                 135

AAA GGC TCC TCG GGG GGT CCG CTC TTG TGC CCC GCG GGA CAC GCC 450
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala
                140                 145                 150

GTC GGC CTA TTC AGG GCC GCG GTC TGC ACC CGT GGA GTC ACC AAG 495
Val Gly Leu Phe Arg Ala Ala Val Cys Thr Arg Gly Val Thr Lys
                155                 160                 165

GCG GTC GAC TTT ATC CCT GTC GAG AAC CTA GAG ACA ACC ATG AGA 540
Ala Val Asp Phe Ile Pro Val Glu Asn Leu Glu Thr Thr Met Arg
                170                 175                 180

TCC CCG GTG
Ser Pro Val
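
Each nucleotide line in this listing is paired with its three-letter amino acid translation, so a line can be checked against the standard genetic code. The Python sketch below is offered as an illustrative aid only and is not part of the listing. The function name is hypothetical, and the one-letter amino acid output is a convenience choice. Any character other than A, C, G or T raises an error, which is useful for spotting transcription damage.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into one-letter amino acids.

    Raises ValueError if the length is not a multiple of three and
    KeyError if a codon contains a character other than A, C, G or T.
    """
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First fifteen codons of SEQ ID NO:1 (Ala Pro Ile Thr Ala Tyr Ala Gln ...).
first_line = translate(
    "GCG CCC ATC ACG GCG TAC GCC CAG CAG ACG AGA GGC CTC CTA GGG"
)
```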

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

Arg Lys Lys Lys Arg Arg


(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 567 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY:

GCG CCC ATC ACG GCG TAC GCC CAG CAG ACG AGA GGC CTC CTA GGG 45
Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly
1               5                   10                  15

TGT ATA ATC ACC AGC CTG ACT GGC CGG GAC AAA AAC CAA GTG GAG 90
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
                20                  25                  30

GGT GAG GTC CAG ATC GTG TCA ACT GCT ACC CAA ACC TTC CTG GCA 135
Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala
                35                  40                  45

ACG TGC ATC AAT GGG GTA TGC TGG ACT GTC TAC CAC GGG GCC GGA 180
Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly
                50                  55                  60

ACG AGG ACC ATC GCA TCA CCC AAG GGT CCT GTC ATC CAG ATG TAT 225
Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr
                65                  70                  75

ACC AAT GTG GAC CAA GAC CTT GTG GGC TGG CCC GCT CCT CAA GGT 270
Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly
                80                  85                  90

TCC CGC TCA TTG ACA CCC TGC ACC TGC GGC TCC TCG GAC CTT TAC 315
Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr
                95                  100                 105

CTG GTT ACG AGG CAC GCC GAC GTC ATT CCC GTG CGC CGG CGA GGT 360
Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly
                110                 115                 120

GAT AGC AGG GGT AGC CTG CTT TCG CCC CGG CCC ATT TCC TAC CTA 405
Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu
                125                 130                 135

AAA GGC TCC TCG GGG GGT CCG CTG TTG TGC CCC GCG GGA CAC GCC 450
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala
                140                 145                 150

GTG GGC CTA TTC AGG GCC GCG GTG TGC ACC CGT GGA GTG ACC AAG 495
Val Gly Leu Phe Arg Ala Ala Val Cys Thr Arg Gly Val Thr Lys
                155                 160                 165

GCG GTG GAC TTT ATC CCT GTG GAG AAC CTA GAG ACA ACC ATG AGA 540
Ala Val Asp Phe Ile Pro Val Glu Asn Leu Glu Thr Thr Met Arg
                170                 175                 180

TCC CCG GTG AGA AAG AAG AAG AGA AGA
Ser Pro Val Arg Lys Lys Lys Arg Arg



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 603 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: pBJ1022(His/NS3 (182)/HX



ATG AGA GGA TCG CAT CAC CAT CAC CAT CAC ACG GAT CCG CCC ATC 45
Met Arg Gly Ser His His His His His His Thr Asp Pro Pro Ile
1               5                   10                  15

ACG GCG TAC GCC CAG CAG ACG AGA GGC CTC CTA GGG TGT ATA ATC 90
Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile
                20                  25                  30

ACC AGC CTG ACT GGC CGG GAC AAA AAC CAA GTG GAG GGT GAG GTC 135
Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
                35                  40                  45

CAG ATC GTG TCA ACT GCT ACC CAA ACC TTC CTG GCA ACG TGC ATC 180
Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr Cys Ile
                50                  55                  60

AAT GGG GTA TGC TGG ACT GTC TAC CAC GGG GCC GGA ACG AGG ACC 225
Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg Thr
                65                  70                  75

ATC GCA TCA CCC AAG GGT CCT GTC ATC CAG ATG TAT ACC AAT GTC 270
Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val
                80                  85                  90

GAC CAA GAC CTT GTC GGC TGG CCC GCT CCT CAA GGT TCC CGC TCA 315
Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser
                95                  100                 105

TTG ACA CCC TGC ACC TGC GGC TCC TCG GAC CTT TAC CTC GTT ACG 360
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr
                110                 115                 120

AGG CAC GCC GAC GTC ATT CCC GTC CGC CGG CGA GGT GAT AGC AGG 405
Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg
                125                 130                 135

GGT AGC CTC CTT TCG CCC CGG CCC ATT TCC TAC CTA AAA GGC TCC 450
Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser
                140                 145                 150

TCG GGG GGT CCG CTC TTG TGC CCC GCG GGA CAC GCC GTC GGC CTA 495
Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu
                155                 160                 165

TTC AGG GCC GCG GTC TGC ACC CGT GGA GTC ACC AAG GCG GTC GAC 540
Phe Arg Ala Ala Val Cys Thr Arg Gly Val Thr Lys Ala Val Asp
                170                 175                 180

TTT ATC CCT GTG GAG AAC CTA GAG ACA ACC ATG AGA TCC CCG GTC 585
Phe Ile Pro Val Glu Asn Leu Glu Thr Thr Met Arg Ser Pro Val
                185                 190                 195

AGA AAG AAG AAG AGA AGA
Arg Lys Lys Lys Arg Arg



10 

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 630 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: pT5His/HIV/183 No solubilizing motif

ATG AGA GGA TCG CAT CAC CAT CAC CAT CAC GGA TCC CAT AAG GCA 45
Met Arg Gly Ser His His His His His His Gly Ser His Lys Ala
1               5                   10                  15

AGA GTT TTG GCT GAA GCA ATG AGC CAT GGT ACC ATG GCG CCC ATC 90
Arg Val Leu Ala Glu Ala Met Ser His Gly Thr Met Ala Pro Ile
                20                  25                  30

ACG GCG TAC GCC CAG CAG ACG AGA GGC CTC CTA GGG TGT ATA ATC 135
Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile
                35                  40                  45

ACC AGC CTG ACT GGC CGG GAC AAA AAC CAA GTG GAG GGT GAG GTC 180
Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
                50                  55                  60

CAG ATC GTG TCA ACT GCT ACC CAA ACC TTC CTG GCA ACG TGC ATC 225
Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr Cys Ile
                65                  70                  75

AAT GGG GTA TGC TGG ACT GTC TAC CAC GGG GCC GGA ACG AGG ACC 270
Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg Thr
                80                  85                  90

ATC GCA TCA CCC AAG GGT CCT GTC ATC CAG ATG TAT ACC AAT GTC 315
Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val
                95                  100                 105

GAC CAA GAC CTT GTC GGC TGG CCC GCT CCT CAA GGT TCC CGC TCA 360
Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser
                110                 115                 120

TTG ACA CCC TGC ACC TGC GGC TCC TCG GAC CTT TAC CTC GTT ACG 405
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr
                125                 130                 135

AGG CAC GCC GAC GTC ATT CCC GTC CGC CGG CGA GGT GAT AGC AGG 450
Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg
                140                 145                 150

GGT AGC CTC CTT TCG CCC CGG CCC ATT TCC TAC CTA AAA GGC TCC 495
Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser
                155                 160                 165

TCG GGG GGT CCG CTC TTG TGC CCC GCG GGA CAC GCC GTC GGC CTA 540
Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu
                170                 175                 180

TTC AGG GCC GCG GTC TGC ACC CGT GGA GTC ACC AAG GCG GTC GAC 585
Phe Arg Ala Ala Val Cys Thr Arg Gly Val Thr Lys Ala Val Asp
                185                 190                 195

TTT ATC CCT GTG GAG AAC CTA GAG ACA ACC ATG AGA TCC CCG GTC 630
Phe Ile Pro Val Glu Asn Leu Glu Thr Thr Met Arg Ser Pro Val
                200                 205                 210

5 

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 162 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: NS4A



20 



AGC ACC TGG GTG CTC GTT GGC GGC GTC CTC GCT GCT CTC GCC GCG 45
Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
1               5                   10                  15

TAT TGC CTG TCA ACA GGC TGC GTG GTC ATA GTC GGC AGG ATT GTC 90
Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Ile Val
                20                  25                  30

TTG TCC GGG AAG CCG GCA ATT ATA CCT GAC AGG GAG GTT CTC TAC 135
Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr
                35                  40                  45

CAG GAG TTC GAT GAG ATG GAA GAG TGC 162
Gln Glu Phe Asp Glu Met Glu Glu Cys
                50



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 702 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: NS3 + NS4A

GCG CCC ATC ACG GCG TAC GCC CAG CAG ACG AGA GGC CTC CTA GGG 45
Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly
1               5                   10                  15

TGT ATA ATC ACC AGC CTG ACT GGC CGG GAC AAA AAC CAA GTC GAG 90
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
                20                  25                  30

GGT GAG GTC CAG ATC GTG TCA ACT GCT ACC CAA ACC TTC CTG GCA 135
Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala
                35                  40                  45

ACG TGC ATC AAT GGG GTA TGC TGG ACT GTC TAC CAC GGG GCC GGA 180
Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly
                50                  55                  60

ACG AGG ACC ATC GCA TCA CCC AAG GGT CCT GTC ATC CAG ATG TAT 225
Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr
                65                  70                  75

ACC AAT GTG GAC CAA GAC CTT GTG GGC TGG CCC GCT CCT CAA GGT 270
Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly
                80                  85                  90

TCC CGC TCA TTG ACA CCC TGC ACC TGC GGC TCC TCG GAC CTT TAC 315
Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr
                95                  100                 105

CTC GTT ACG AGG CAC GCC GAC GTC ATT CCC GTG CGC CGG CGA GGT 360
Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly
                110                 115                 120

GAT AGC AGG GGT AGC CTG CTT TCG CCC CGG CCC ATT TCC TAC CTA 405
Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu
                125                 130                 135

AAA GGC TCC TCG GGG GGT CCG CTG TTG TGC CCC GCG GGA CAC GCC 450
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala
                140                 145                 150

GTC GGC CTA TTC AGG GCC GCG GTG TGC ACC CGT GGA GTC ACC AAG 495
Val Gly Leu Phe Arg Ala Ala Val Cys Thr Arg Gly Val Thr Lys
                155                 160                 165

GCG GTG GAC TTT ATC CCT GTG GAG AAC CTA GAG ACA ACC ATG AGA 540
Ala Val Asp Phe Ile Pro Val Glu Asn Leu Glu Thr Thr Met Arg
                170                 175                 180

TCC CCG GGG GTG CTC GTT GGC GGC GTC CTC GCT GCT CTC GCC GCG 585
Ser Pro Gly Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
                185                 190                 195

TAT TGC CTG TCA ACA GGC TGC GTC GTC ATA GTC GGC AGG ATT GTC 630
Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Ile Val
                200                 205                 210

TTG TCC GGG AAG CCG GCA ATT ATA CCT GAC AGG GAG GTT CTC TAC 675
Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr
                215                 220                 225

CAG GAG TTC GAT GAG ATG GAA GAG TGC 702
Gln Glu Phe Asp Glu Met Glu Glu Cys
                230



(2) INFORMATION FOR SEQ ID NO:8: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 855 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: pNB182A4AHr 



ATG AGA GGA TCG CAT CAC CAT CAC CAT CAC GGA TCC CAT AAG GCA 45
Met Arg Gly Ser His His His His His His Gly Ser His Lys Ala
1               5                   10                  15

AGA GTT TTG GCT GAA GCA ATG AGC CAT GGT ACC ATG GCG CCC ATC 90
Arg Val Leu Ala Glu Ala Met Ser His Gly Thr Met Ala Pro Ile
                20                  25                  30

ACG GCG TAC GCC CAG CAG ACG AGA GGC CTC CTA GGG TGT ATA ATC 135
Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile
                35                  40                  45

ACC AGC CTG ACT GGC CGG GAC AAA AAC CAA GTG GAG GGT GAG GTC 180
Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
                50                  55                  60

CAG ATC GTG TCA ACT GCT ACC CAA ACC TTC CTG GCA ACG TGC ATC 225
Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr Cys Ile
                65                  70                  75

AAT GGG GTA TGC TGG ACT GTC TAC CAC GGG GCC GGA ACG AGG ACC 270
Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg Thr
                80                  85                  90

ATC GCA TCA CCC AAG GGT CCT GTC ATC CAG ATG TAT ACC AAT GTG 315
Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val
                95                  100                 105

GAC CAA GAC CTT GTG GGC TGG CCC GCT CCT CAA GGT TCC CGC TCA 360
Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser
                110                 115                 120

TTG ACA CCC TGC ACC TGC GGC TCC TCG GAC CTT TAC CTG GTT ACG 405
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr
                125                 130                 135

AGG CAC GCC GAC GTC ATT CCC GTG CGC CGG CGA GGT GAT AGC AGG 450
Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg
                140                 145                 150

GGT AGC CTG CTT TCG CCC CGG CCC ATT TCC TAC CTA AAA GGC TCC 495
Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser
                155                 160                 165

TCG GGG GGT CCG CTG TTG TGC CCC GCG GGA CAC GCC GTG GGC CTA 540
Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu
                170                 175                 180

TTC AGG GCC GCG GTG TGC ACC CGT GGA GTG ACC AAG GCG GTG GAC 585
Phe Arg Ala Ala Val Cys Thr Arg Gly Val Thr Lys Ala Val Asp
                185                 190                 195

TTT ATC CCT GTG GAG AAC CTA GAG ACA ACC ATG AGA TCC CCG GGG 630
Phe Ile Pro Val Glu Asn Leu Glu Thr Thr Met Arg Ser Pro Gly
                200                 205                 210

GTG CTC GTT GGC GGC GTC CTG GCT GCT CTG GCC GCG TAT TGC CTG 720
Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu
                215                 220                 225

TCA ACA GGC TGC GTG GTC ATA GTG GGC AGG ATT GTC TTG TCC GGG 765
Ser Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly
                230                 235                 240

AAG CCG GCA ATT ATA CCT GAC AGG GAG GTT CTC TAC CAG GAG TTC 810
Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe
                245                 250                 255

GAT GAG ATG GAA GAG TGC CGG AAG AAA AAG AGA CGC AAG CTT AAT 855
Asp Glu Met Glu Glu Cys Arg Lys Lys Lys Arg Arg Lys Leu Asn
                260



15 

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 711 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY:

GCG CCC ATC ACG GCG TAC GCC CAG CAG ACG AGA GGC CTC CTA GGG 45
Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly
1               5                   10                  15

TGT ATA ATC ACC AGC CTG ACT GGC CGG GAC AAA AAC CAA GTG GAG 90
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
                20                  25                  30

GGT GAG GTC CAG ATC GTG TCA ACT GCT ACC CAA ACC TTC CTG GCA 135
Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala
                35                  40                  45

ACG TGC ATC AAT GGG GTA TGC TGG ACT GTC TAC CAC GGG GCC GGA 180
Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly
                50                  55                  60

ACG AGG ACC ATC GCA TCA CCC AAG GGT CCT GTC ATC CAG ATG TAT 225
Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr
                65                  70                  75

ACC AAT GTG GAC CAA GAC CTT GTG GGC TGG CCC GCT CCT CAA GGT 270
Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly
                80                  85                  90

TCC CGC TCA TTG ACA CCC TGC ACC TGC GGC TCC TCG GAC CTT TAC 315
Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr
                95                  100                 105

CTG GTT ACG AGG CAC GCC GAC GTC ATT CCC GTG CGC CGG CGA GGT 360
Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly
                110                 115                 120

GAT AGC AGG GGT AGC CTG CTT TCG CCC CGG CCC ATT TCC TAC CTA 405
Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu
                125                 130                 135

AAA GGC TCC TCG GGG GGT CCG CTG TTG TGC CCC GCG GGA CAC GCC 450
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala
                140                 145                 150

GTG GGC CTA TTC AGG GCC GCG GTG TGC ACC CGT GGA GTG ACC AAG 495
Val Gly Leu Phe Arg Ala Ala Val Cys Thr Arg Gly Val Thr Lys
                155                 160                 165

GCG GTG GAC TTT ATC CCT GTG GAG AAC CTA GAG ACA ACC ATG AGA 540
Ala Val Asp Phe Ile Pro Val Glu Asn Leu Glu Thr Thr Met Arg
                170                 175                 180

TCC CCG GGG GTG CTC GTT GGC GGC GTC CTG GCT GCT CTG GCC GCG 585
Ser Pro Gly Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
                185                 190                 195

TAT TGC CTG TCA ACA GGC TGC GTG GTC ATA GTC GGC AGG ATT GTC 630
Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Ile Val
                200                 205                 210

TTG TCC GGG AAG CCG GCA ATT ATA CCT GAC AGG GAG GTT CTC TAC 675
Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr
                215                 220                 225

CAG GAG TTC GAT GAG ATG GAA GAG AAG GAG ACA GAG
Gln Glu Phe Asp Glu Met Glu Glu Lys Glu Thr Glu
                230


(2) INFORMATION FOR SEQ ID NO:10: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 855 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY:

ATG AGA GGA TCG CAT CAC CAT CAC CAT CAC ACG GAT CCG GCG CCC
Met Arg Gly Ser His His His His His His Thr Asp Pro Ala Pro
1               5                   10                  15

ATC ACG GCG TAC GCC CAG CAG ACG AGA GGC CTC CTA GGG TGT ATA 45
Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile
                20                  25                  30

ATC ACC AGC CTG ACT GGC CGG GAC AAA AAC CAA GTG GAG GGT GAG 90
Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu
                35                  40                  45

GTC CAG ATC GTG TCA ACT GCT ACC CAA ACC TTC CTC GCA ACG TGC 135
Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr Cys
                50                  55                  60

ATC AAT GGG GTA TGC TGG ACT GTC TAC CAC GGG GCC GGA ACG AGG 180
Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg
                65                  70                  75

ACC ATC GCA TCA CCC AAG GGT CCT GTC ATC CAG ATG TAT ACC AAT 225
Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn
                80                  85                  90

GTG GAC CAA GAC CTT GTC GGC TGG CCC GCT CCT CAA GGT TCC CGC 270
Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg
                95                  100                 105

TCA TTG ACA CCC TGC ACC TGC GGC TCC TCG GAC CTT TAC CTC GTT 315
Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val
                110                 115                 120

ACG AGG CAC GCC GAC GTC ATT CCC GTC CGC CGG CGA GGT GAT AGC 360
Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser
                125                 130                 135

AGG GGT AGC CTC CTT TCG CCC CGG CCC ATT TCC TAC CTA AAA GGC 405
Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly
                140                 145                 150

TCC TCG GGG GGT CCG CTC TTG TGC CCC GCG GGA CAC GCC GTC GGC 450
Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly
                155                 160                 165

CTA TTC AGG GCC GCG GTG TGC ACC CGT GGA GTC ACC AAG GCG GTC 495
Leu Phe Arg Ala Ala Val Cys Thr Arg Gly Val Thr Lys Ala Val
                170                 175                 180

GAC TTT ATC CCT GTG GAG AAC CTA GAG ACA ACC ATG AGA TCC CCG 540
Asp Phe Ile Pro Val Glu Asn Leu Glu Thr Thr Met Arg Ser Pro
                185                 190                 195

GGG GTG CTC GTT GGC GGC GTC CTG GCT GCT CTG GCC GCG TAT TGC 585
Gly Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys
                200                 205                 210

CTC TCA ACA GGC TGC GTC GTC ATA GTC GGC AGG ATT GTC TTG TCC 630
Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser
                215                 220                 225

GGG AAG CCG GCA ATT ATA CCT GAC AGG GAG GTT CTC TAC CAG GAG 675
Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu
                230                 235                 240

TTC GAT GAG ATG GAA GAG AAG GAG ACA GAG 705
Phe Asp Glu Met Glu Glu Lys Glu Thr Glu
                245                 250



(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: double

(ii) MOLECULE TYPE: cDNA

GA TCA CCG GTC TAG ATCT
     T GGC CAG ATC TAGA

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY:

CCG GTC CGG AAG AAA AAG AGA CGC TAG C
AG GCC TTC TTT TTC TCT GCG ATC G
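A double-stranded listing such as SEQ ID NO:12 can be sanity-checked by confirming that the lower strand, as printed, is the base-by-base complement of part of the upper strand. The sketch below is illustrative only: the function names are hypothetical, and the two strand strings are an assumed reading of the printed listing in which OCR-damaged characters are taken to be G.

```python
# Minimal sketch (not part of the original listing): check that the printed
# lower strand pairs with the upper strand and report the 5' overhang.
# ASSUMPTION: the strand strings below are a reading of SEQ ID NO:12 in which
# damaged characters are interpreted as G.

PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement (same left-to-right order)."""
    return "".join(PAIRS[base] for base in seq)


def find_overhang_offset(top: str, bottom_as_printed: str):
    """Return the index in `top` where the printed lower strand starts pairing.

    The lower strand is written left to right underneath the upper strand,
    so it is compared with the plain complement, not the reverse complement.
    Returns None when no alignment pairs perfectly.
    """
    comp = complement(top)
    width = len(bottom_as_printed)
    for offset in range(len(top) - width + 1):
        if comp[offset:offset + width] == bottom_as_printed:
            return offset
    return None


TOP = "".join("CCG GTC CGG AAG AAA AAG AGA CGC TAG C".split())
BOTTOM_AS_PRINTED = "".join("AG GCC TTC TTT TTC TCT GCG ATC G".split())

offset = find_overhang_offset(TOP, BOTTOM_AS_PRINTED)
print("upper strand length:", len(TOP))
print("pairing starts at index:", offset)
print("unpaired 5' end of upper strand:", TOP[:offset] if offset is not None else None)
```

Under this reading the upper strand has 28 bases, in line with the stated LENGTH, and the unpaired bases at its 5' end form a short single-stranded overhang.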

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 79 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY:

CCG GCA ATT ATA CCT GAC AGG GAG GTT CTC TAC CAG GAA TTC
GT TAA TAT GGA CTG TCC CTC CAA GAG ATG GTC CTT AAG

GAT GAG ATG GAA GAG TGC CGG AAG AAA AAG AGA CGC A
CTA CTC TAC CTT CTC ACG GCC TTC TTT TTC TCT GCG TTC GA

(2) INFORMATION FOR SEQ ID NO:14:



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: NS4A Active Mutant

Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys
5 10

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: NS4A Active Mutant

Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys

5 10

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: Soluble 5A/5B Substrate

Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr

5 10 15

Gly Lys



(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: Mutant 5A/5B Substrate

Asp Thr Glu Asp Val Val Ala Cys Ser Met Ser Tyr Thr Trp Thr

5 10 15

Gly

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: Mutant Soluble 5A/5B Substrate

Asp Thr Glu Asp Val Val Ala Cys Ser Met Ser Tyr Thr Trp Thr

5 10 15

Gly Lys

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: Soluble 5A/5B Substrate

Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr

5 10 15

Gly Lys Tyr

(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: Soluble 5A/5B Substrate

Asp Thr Glu Asp Val Val Ala Cys Ser Met Ser Tyr Thr Trp Thr

5 10 15

Gly Lys Tyr



(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: Soluble 4B/5A Substrate

Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu

5 10 15

Arg Asp Ile Trp Asp

(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: histidine tag

Met Arg Gly Ser His His His His His His Thr Asp Pro

5 10

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: hydrophilic tail

Arg Lys Lys Lys Arg Arg Lys Leu Asn

5

(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: hydrophilic tail

Lys Glu Thr Glu

(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: hydrophilic tail

Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu

5 10 15

Arg Asp Ile Trp Asp

20

(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 162 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: NS4A Mutant

GTG CTC GTT GGC GGC GTC CTG GCT GCT CTG GCC GCG TAT TGC CTG 45
Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu
1 5 10 15

TCA ACA GGC TGC GTG GTC ATA GTG GGC AGG ATT GTC TTG TCC GGG 90
Ser Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly

20 25 30

AAG CCG GCA ATT ATA CCT GAC AGG GAG GTT CTC TAC CAG GAG TTC 135
Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe

35 40 45

GAT GAG ATG GAA GAG TGC
Asp Glu Met Glu Glu Cys

50
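Because each cDNA row in this listing is printed above its translation, the two rows can be cross-checked with the standard genetic code. The following is a minimal sketch, not part of the original listing: the helper names are hypothetical, and the cDNA string is an assumed reading of SEQ ID NO:26 in which the damaged first codon is taken to be GTG (Val).

```python
# Minimal sketch: translate the NS4A mutant cDNA with the standard genetic
# code and compare it with the amino acid rows printed under it.
# ASSUMPTION: the first codon is read as GTG because the listed residue is
# Val; all names below are illustrative.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def translate(cdna: str) -> str:
    """Translate a cDNA string codon by codon into one-letter residues."""
    if len(cdna) % 3 != 0:
        raise ValueError("cDNA length is not a multiple of three")
    return "".join(CODON_TABLE[cdna[i:i + 3]] for i in range(0, len(cdna), 3))


NS4A_MUTANT_CDNA = "".join("""
GTG CTC GTT GGC GGC GTC CTG GCT GCT CTG GCC GCG TAT TGC CTG
TCA ACA GGC TGC GTG GTC ATA GTG GGC AGG ATT GTC TTG TCC GGG
AAG CCG GCA ATT ATA CCT GAC AGG GAG GTT CTC TAC CAG GAG TTC
GAT GAG ATG GAA GAG TGC
""".split())

LISTED_PROTEIN = """
Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu
Ser Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly
Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe
Asp Glu Met Glu Glu Cys
""".split()

translated = translate(NS4A_MUTANT_CDNA)
listed = "".join(THREE_TO_ONE[res] for res in LISTED_PROTEIN)
print(translated)
print("matches listed residues:", translated == listed)
```

The same check can be pointed at any other cDNA row in the listing; a mismatch at a single codon usually indicates a transcription error in either the nucleotide row or the residue row.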

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 810 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: pNB182A4AHT

ATG AGA GGA TCG CAT CAC CAT CAC CAT CAC ACG GAT CCG CCC ATC 45
Met Arg Gly Ser His His His His His His Thr Asp Pro Pro Ile
1 5 10 15

ACG GCG TAC GCC CAG CAG ACG AGA GGC CTC CTA GGG TGT ATA ATC 90
Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile

20 25 30



ACC AGC CTG ACT GGC CGG GAC AAA AAC CAA GTG GAG GGT GAG GTC 135
Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val

35 40 45

CAG ATC GTG TCA ACT GCT ACC CAA ACC TTC CTG GCA ACG TGC ATC 180
Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr Cys Ile

50 55 60

AAT GGG GTA TGC TGG ACT GTC TAC CAC GGG GCC GGA ACG AGG ACC 225
Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg Thr
65 70 75

ATC GCA TCA CCC AAG GGT CCT GTC ATC CAG ATG TAT ACC AAT GTC 270
Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val

80 85 90

GAC CAA GAC CTT GTC GGC TGG CCC GCT CCT CAA GGT TCC CGC TCA 315
Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser

95 100 105

TTC ACA CCC TGC ACC TGC GGC TCC TCG GAC CTT TAC CTC GTT ACG 360
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr

110 115 120

AGG CAC GCC GAC GTC ATT CCC GTC CGC CGG CGA GGT GAT AGC AGG 405
Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg

125 130 135

GGT AGC CTC CTT TCG CCC CGG CCC ATT TCC TAC CTA AAA GGC TCC 450
Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser

140 145 150

WO 96/35717 PCT/US96/06389

TCG GGG GGT CCG CTG TTG TGC CCC GCG GGA CAC GCC GTG GGC CTA 495
Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu

155 160 165

TTC AGG GCC GCG GTG TGC ACC CGT GGA GTG ACC AAG GCG GTG GAC 540
Phe Arg Ala Ala Val Cys Thr Arg Gly Val Thr Lys Ala Val Asp

170 175 180

TTT ATC CCT GTG GAG AAC CTA GAG ACA ACC ATG AGA TCC CCG GGG 585
Phe Ile Pro Val Glu Asn Leu Glu Thr Thr Met Arg Ser Pro Gly

185 190 195



GTG CTC GTT GGC GGC GTC CTG GCT GCT CTG GCC GCG TAT TGC CTG 630
Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu

200 205 210

TCA ACA GGC TGC GTG GTC ATA GTG GGC AGG ATT GTC TTG TCC GGG 720
Ser Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly

215 220 225

AAG CCG GCA ATT ATA CCT GAC AGG GAG GTT CTC TAC CAG GAG TTC 765
Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe

230 235 240

GAT GAG ATG GAA GAG TGC CGG AAG AAA AAG AGA CGC AAG CTT AAT 810
Asp Glu Met Glu Glu Cys Arg Lys Lys Lys Arg Arg Lys Leu Asn

245 250 255



(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 162 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: Native NS4A

TCA ACA TGG GTG CTC GTT GGC GGC GTC CTG GCT GCT CTG GCC GCG 45
Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
1 5 10 15

TAT TGC CTG TCA ACA GGC TGC GTG GTC ATA GTG GGC AGG ATT GTC 90
Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Ile Val

20 25 30

TTG TCC GGG AAG CCG GCA ATT ATA CCT GAC AGG GAG GTT CTC TAC 135
Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr
35 40 45

CAG GAG TTC GAT GAG ATG GAA GAG TGC
Gln Glu Phe Asp Glu Met Glu Glu Cys

50

(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acid residues
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: Carboxyl 33-mer of NS4A

Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro Ala

5 10 15

Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met

20 25 30

Glu Glu Cys

(2) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acid residues
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: Carboxyl 33-mer of NS4A of HCV-BK strain

Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala
5 10 15

Ile Val Pro Asp Arg Glu Leu Leu Tyr Gln Glu Phe Asp Glu Met

20 25 30

Glu Glu Cys

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: polypeptide

(ix) FEATURE:
(A) NAME/KEY: Native 5A/5B Substrate

Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr

5 10 15

Gly

WE CLAIM:

1. A soluble HCV substrate which encodes a nonstructural polyprotein
of the HCV genome.

2. The HCV substrate of claim 1 further comprising a solubilizing motif
attached to said substrate.

3. The HCV substrate of claim 2 wherein the solubilizing motif is
comprised of an ionizable amino acid.

4. The HCV substrate of claim 3 wherein the ionizable substrate is either
arginine or lysine.

5. The HCV substrate of claim 4 having a sequence defined by
SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID
NO: 20, and SEQ ID NO: 21.

6. A nucleic acid encoding the soluble HCV substrate of claims 1, 2, 3, 4
or 5.

7. A vector containing a nucleic acid which encodes a soluble HCV
substrate of claims 1, 2, 3, 4, or 5.

8. A cell transfected or transformed with a vector containing a nucleic acid
which encodes a soluble HCV substrate of claims 1, 2, 3, 4, or 5.
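The relationship between the native 5A/5B substrate and its soluble variants, as recited in claims 2 to 4, can be illustrated by extracting the residues appended to the native sequence and checking whether they include arginine or lysine. This is only an illustrative sketch: the function names are hypothetical, and the residue lists are a transcription of SEQ ID NO:31, SEQ ID NO:16 and SEQ ID NO:19 as read from the sequence listing.

```python
# Minimal sketch: find the residues appended to the native 5A/5B substrate
# and check whether the appended tail contains arginine or lysine.
# ASSUMPTION: residue lists are transcribed from the sequence listing;
# helper names are illustrative, not taken from the patent.

IONIZABLE_BASIC = {"Arg", "Lys"}

NATIVE_5A5B = ("Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr "
               "Thr Trp Thr Gly").split()                      # SEQ ID NO:31
SOLUBLE_16 = ("Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr "
              "Thr Trp Thr Gly Lys").split()                   # SEQ ID NO:16
SOLUBLE_19 = ("Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr "
              "Thr Trp Thr Gly Lys Tyr").split()               # SEQ ID NO:19


def appended_tail(native, variant):
    """Return the residues that follow the native sequence in the variant.

    Returns None when the variant does not begin with the native sequence,
    for example when it carries an internal substitution.
    """
    if variant[:len(native)] != native:
        return None
    return variant[len(native):]


def has_basic_residue(tail):
    """True when the tail contains at least one Arg or Lys residue."""
    return any(res in IONIZABLE_BASIC for res in tail)


for name, variant in (("SEQ ID NO:16", SOLUBLE_16), ("SEQ ID NO:19", SOLUBLE_19)):
    tail = appended_tail(NATIVE_5A5B, variant)
    print(name, "tail:", tail, "contains Arg/Lys:", has_basic_residue(tail))
```

Variants with an internal substitution, such as the Cys to Ala mutants in the listing, do not start with the native sequence, so `appended_tail` returns `None` for them and they would need a separate comparison.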



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